WO 99/61458
PCT/AU99/00385

Antigens and Their Detection

TECHNICAL FIELD 

The invention relates to novel nucleotide sequences 
located in a gene which encodes a bacterial flagellin 
antigen, and the use of those nucleotide sequences for the
detection of bacteria which express particular flagellin 
antigens, on the basis of that antigen alone, or in 
conjunction with the O antigen expressed by that strain. 

BACKGROUND ART

The flagellum of many bacteria appears to be made up of a
single protein known as flagellin. The serotyping schemes
of E. coli and Salmonella enterica are based on highly
variable antigenic surface structures which include the
lipopolysaccharide which carries the O antigen and
flagellin which is now known to be the carrier of the
classical H antigen. In many strains of S. enterica there
are two loci (fliC and fljB) which encode flagellin, and a
regulatory system which allows only one to be expressed at
any time, and which also provides for expression to
alternate rapidly between the two forms, first identified
as two phases (H1 and H2) for the H antigen of most
strains. In E. coli there are 54 forms of H antigen
recognised and until recently they were all thought to be
encoded at the fliC locus, as has been shown for E. coli
K-12. However in the 1980s Ratiner [Ratiner Y A "Phase
variation of the H antigen in Escherichia coli strain
Bi327-41, the standard strain for Escherichia coli
flagellin antigen H3" FEMS Microbiol. Lett. 15 (1982) 33-36;
Ratiner Y A "Presence of two structural genes
determining antigenically different phase-specific
flagellins in some Escherichia coli strains" FEMS
Microbiol. Lett. 19 (1983) 37-41; Ratiner Y A "Two genetic
arrangements determining flagellin antigen specificities
in two diphasic Escherichia coli strains" FEMS Microbiol.
Lett. 29 (1985) 317-323; Ratiner Y A "Different alleles of
the flagellin gene hagB in Escherichia coli standard H
test strains" FEMS Microbiol. Lett. 48 (1987) 97-104]
showed that in some cases there are two loci and that
expression can alternate. The matter was further
complicated by a recent paper by Ratiner [Ratiner Y A
(1998) "New flagellin-specifying genes in some Escherichia
coli strains" J. Bacteriol. 180 979-984] showing three
loci (flk, fll and flm) for flagellin in addition to fliC,
although the fljB locus has not been found in E. coli.
However E. coli strains are normally identified by the
combination of one O antigen and one H antigen [and K
antigen when present as a capsule (K) antigen], with no
problems reported for the vast majority of cases with
alternate phases, while S. enterica strains are normally
identified by the combination of O, H1 and H2 antigens. It
is still not clear how widespread in E. coli H antigens
determined by flagellin genes other than fliC are.

Typing is typically carried out using specific
antisera. The incidence of pathogenic E. coli in
association with human and animal disease supports the
need for suitable and rapid typing techniques.

DESCRIPTION OF THE INVENTION 

In a first aspect, the present invention provides a 
novel nucleic acid molecule encoding all or part of an E. 
coli flagellin protein. 

The present invention provides, for the first time,
full length sequence for a flagellin gene for the
following E. coli type strains: H6 (SEQ ID NO: 8), H9 (SEQ
ID NO: 11), H10 (SEQ ID NO: 12), H14 (SEQ ID NO: 15),
H18 (SEQ ID NO: 18), H23 (SEQ ID NO: 22), H51 (SEQ ID NO:
50), H45 (SEQ ID NO: 43), H49 (SEQ ID NO: 48), H19 (SEQ ID
NO: 19), H30 (SEQ ID NO: 29), H32 (SEQ ID NO: 31), H26 (SEQ
ID NO: 25), H41 (SEQ ID NO: 39), H15 (SEQ ID NO: 16),
H20 (SEQ ID NO: 20), H28 (SEQ ID NO: 27), H46 (SEQ ID NO:
44), H31 (SEQ ID NO: 30), H34 (SEQ ID NO: 33), H43 (SEQ ID
NO: 41) and H52 (SEQ ID NO: 51). Corrected full length
sequences have been obtained for H7 (SEQ ID NO: 9) and
H12 (SEQ ID NO: 14) type strains.

Partial flagellin gene sequence, including the
central variable region, has been obtained for the
following E. coli H type strains: H40 (SEQ ID NO: 38),
H8 (SEQ ID NO: 10), H21 (SEQ ID NO: 21), H47 (SEQ ID NO: 46),
H11 (SEQ ID NO: 13), H17 (SEQ ID NO: 17), H25 (SEQ ID NO:
24), H42 (SEQ ID NO: 40), H27 (SEQ ID NO: 26), H35 (SEQ ID
NO: 34), H2 (SEQ ID NO: 67), H3 (SEQ ID NO: 68), H24 (SEQ ID
NO: 23), H37 (SEQ ID NO: 35), H50 (SEQ ID NO: 49), H4 (SEQ ID
NO: 6), H44 (SEQ ID NO: 42), H38 (SEQ ID NO: 36), H39 (SEQ ID
NO: 37), H55 (SEQ ID NO: 53), H29 (SEQ ID NO: 28), H33 (SEQ
ID NO: 32), H5 (SEQ ID NO: 7), H54 (SEQ ID NO: 52) and
H56 (SEQ ID NO: 54).

Comparison of sequences demonstrates that unique
flagellin genes have now been sequenced (partially or
completely) for the following E. coli H type strains: H1,
H2, H3, H5, H6, H7, H9, H11, H12, H14, H15, H18, H19, H20,
H21, H23, H24, H25, H26, H27, H28, H29, H30, H31, H32,
H33, H34, H35, H37, H38, H39, H41, H42, H43, H45, H46,
H48, H49, H51, H52, H54, and H56 and either H8 or H40, H10
or H50 and H4 or H17.

By comparison of these sequences, the present
inventors were able to identify specific sequences for
each of the above H serotypes.

The present invention also provides fliC sequences
from 10 different H7 strains, in addition to that from the
H7 type strain, and two sequences specific to H7 of O157
and O55 E. coli strains.

The present invention encompasses all or part of the
flagellin genes sequenced for H2, H3, H5, H6, H9, H11,
H14, H18, H19, H20, H21, H23, H24, H25, H26, H27, H28,
H29, H30, H31, H32, H33, H34, H35, H37, H38, H39, H41,
H42, H43, H44, H45, H46, H47, H48, H49, H51, H52, H54,
H55, H56, H8, H40, H15, H10, or H50, H4 and H17 type
strains. Of these flagellin genes sequenced, those from
the type strains for H8 and H40 are identical, and those
from type strains H10 and H50, H1 and H12, H38 and H55,
H21 and H47, and H4, H17 and H44 are highly similar.

The invention also encompasses newly provided
sequence for H7 and H12 as well as novel primers for the
specific amplification of H1, H7, H12 and H48 as well as
for the other above mentioned newly sequenced flagellin
genes.

By cloning and expression of these sequenced
flagellin genes in a fliC deletion E. coli K-12 strain,
and use of anti-H antiserum, we have confirmed the H
specificities encoded by 39 flagellin genes. The 39 H
specificities are H1, H2, H4, H5, H6, H7, H9, H10, H11,
H12, H14, H15, H16, H18, H19, H20, H21, H23, H24, H26,
H27, H28, H29, H30, H31, H32, H33, H34, H38, H39, H41,
H42, H43, H45, H46, H49, H51, H52, and H56, encoded by
flagellin genes obtained from H type strains for H1, H2,
H4, H5, H6, H7, H9, H10, H11, H12, H14, H15, H3, H18, H19,
H20, H21, H23, H24, H26, H27, H28, H29, H30, H31, H32,
H33, H34, H38, H39, H41, H42, H43, H45, H46, H49, H51,
H52, and H56 respectively.

The nucleic acid molecules of the invention may be
variable in length. In one embodiment they are
oligonucleotides of from about 10 to about 20 nucleotides
in length. The oligonucleotides of the invention are
specific for the flagellin gene from which they are
derived and are derived from the central region of the
gene. In one embodiment, oligonucleotides in accordance
with the present invention, which also include
oligonucleotides from the previously sequenced E. coli H1,
H7, H12 and H48 genes, are those shown in Table 3.

The 45 sequences (see Table 3) provide a panel to 
which newly sequenced genes can be compared to select 
specific oligonucleotides for those newly sequenced genes. 
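
The comparison against a panel described above can be illustrated with a minimal sketch. It assumes the panel is available as plain strings of A, C, G and T; the function names and the short toy sequences are hypothetical and are not taken from Table 3. The sketch lists each k-mer of a target sequence that is absent, on either strand, from every other sequence in the panel.

```python
# Hypothetical helper names; toy sequences only, not data from the specification.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def candidate_specific_oligos(target: str, panel: list[str], k: int = 12):
    """List (1-based start, k-mer) pairs of `target` that occur in no panel
    sequence, checking both the k-mer and its reverse complement."""
    hits = []
    for i in range(len(target) - k + 1):
        kmer = target[i:i + k]
        rc = reverse_complement(kmer)
        if all(kmer not in other and rc not in other for other in panel):
            hits.append((i + 1, kmer))
    return hits


# Toy example: shared ends, different central region (assumed for illustration).
toy_target = "ATGGCA" + "CTTAGGCATC" + "GGTTAA"
toy_other = "ATGGCA" + "GACCTGAATT" + "GGTTAA"
toy_hits = candidate_specific_oligos(toy_target, [toy_other], k=6)
```

An exact-match screen of this kind only narrows the candidates; specificity of any chosen oligonucleotide would still need to be confirmed experimentally.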

In a second aspect the invention provides a method of
detecting the presence of E. coli of a particular H
serotype in a sample, the method comprising the step of
specifically hybridising at least one nucleic acid
molecule derived from a flagellin gene, wherein the at
least one nucleic acid molecule is specific for a
particular flagellin gene associated with the H serotype,
to any E. coli in the sample which contain the gene, and
detecting any specifically hybridised nucleic acid
molecules, wherein the presence of specifically hybridised
nucleic acid molecules identifies the presence of the H
serotype in the sample.

In one preferred embodiment the detection method is a
Southern blot method. More preferably, the nucleic acid
molecule is labelled and hybridisation of the nucleic acid
molecule is detected by autoradiography or detection of
fluorescence.

Preferred nucleic acid molecules for the detection of
particular flagellin genes are listed in Table 3.

In a third aspect the invention provides a method of 
detecting the presence of E. coli of a particular H 
serotype in a sample, the method comprising the step of 
specifically hybridising at least one pair of nucleic acid 
molecules to any E. coli in the sample which contains the 
flagellin gene for the particular H serotype, wherein at 
least one of the nucleic acid molecules is specific for 
the particular flagellin gene associated with the H 
serotype, and detecting any specifically hybridised 
nucleic acid molecules, wherein the presence of 
specifically hybridised nucleic acid molecules identifies 
the presence of the H serotype in the sample. 

In one preferred embodiment the detection method is a 
polymerase chain reaction method. More preferably, the 
nucleic acid molecules are labelled and hybridisation of 
the nucleic acid molecule is detected by electrophoresis.

It is recognised that there may be instances where
spurious hybridisation will arise through the initial
selection of a sequence found in many different genes but
this is typically recognisable by, for instance,
comparison of band sizes against controls in PCR gels, and
an alternative sequence can be selected.

In a fourth aspect the invention provides a method
for detecting the presence of a particular O serotype and
H serotype of E. coli in a sample, the method comprising
the following steps:

(a) specifically hybridising at least one nucleic 
acid molecule, derived from and specific for a gene 
encoding a transferase or a gene encoding an enzyme for 
the transport or processing of a polysaccharide or 
oligosaccharide unit, the gene being involved in the 
synthesis of a particular E. coli O antigen, to any E. 
coli in the sample which contain the gene; 

(b) specifically hybridising at least one nucleic
acid molecule derived from and specific for a particular
flagellin gene associated with that H serotype, to any E.
coli in the sample which contain the gene; and

(c) detecting any specifically hybridised nucleic
acid molecules.

Preferred nucleic acid molecules for the detection of
particular flagellin genes are listed in Table 3.

In one preferred embodiment, the sequence of the
nucleic acid molecule specific for the O antigen is
specific to the nucleotide sequence encoding the O111
antigen. More preferably, the sequence is derived from a
gene selected from the group consisting of wbdH
(nucleotide position 739 to 1932 of Figure 5), wzx
(nucleotide position 8646 to 9911 of Figure 5), wzy
(nucleotide position 9901 to 10953 of Figure 5), wbdM
(nucleotide position 11821 to 12945 of Figure 5) and
fragments of those molecules of at least 10-12 nucleotides
in length. Particularly preferred nucleic acid molecules
are those set out in Tables 8 and 8A, with respect to the
above mentioned genes.
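
As an illustration of how the quoted nucleotide positions can be used, the following minimal sketch treats them as 1-based, inclusive coordinates on the Figure 5 sequence (an assumption consistent with each quoted span being a whole number of codons). The dictionary repeats the positions stated above; the function names are hypothetical, and no sequence data from Figure 5 is reproduced.

```python
# Positions as quoted in the text for the O111 gene cluster (Figure 5).
O111_GENE_POSITIONS = {
    "wbdH": (739, 1932),
    "wzx": (8646, 9911),
    "wzy": (9901, 10953),
    "wbdM": (11821, 12945),
}


def gene_length(name: str) -> int:
    """Length in nucleotides of a gene given 1-based inclusive positions."""
    start, end = O111_GENE_POSITIONS[name]
    return end - start + 1


def slice_1based(sequence: str, start: int, end: int) -> str:
    """Extract positions start..end (1-based, inclusive) from a sequence."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("positions fall outside the sequence")
    return sequence[start - 1:end]


def fragments(gene: str, length: int = 10) -> list[str]:
    """All contiguous fragments of the given length (10 is the lower bound
    of the '10-12 nucleotides' mentioned in the text)."""
    return [gene[i:i + length] for i in range(len(gene) - length + 1)]
```

For example, `gene_length("wbdH")` evaluates 1932 - 739 + 1, and `slice_1based(cluster, 739, 1932)` would return the wbdH region if `cluster` held the full Figure 5 sequence.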

In another preferred embodiment, the sequence of the
nucleic acid molecule specific for the O antigen is
specific to the nucleotide sequence encoding the O157
antigen. More preferably, the sequence is derived from a
gene selected from the group consisting of wbdN
(nucleotide position 79 to 861 of Figure 6), wbdO
(nucleotide position 2011 to 2757 of Figure 6), wbdP
(nucleotide position 5257 to 6471 of Figure 6), wbdR
(nucleotide position 13156 to 13821 of Figure 6), wzx
(nucleotide position 2744 to 4135 of Figure 6) and wzy
(nucleotide position 858 to 2042 of Figure 6) and
fragments of those molecules of at least 10-12 nucleotides
in length. Particularly preferred nucleic acid molecules
are those set out in Tables 9 and 9A, with respect to the
above mentioned genes.

In one preferred embodiment the detection method is a
Southern blot method. More preferably, the nucleic acid
molecule is labelled and hybridisation of the nucleic acid
molecule is detected by autoradiography or detection of
fluorescence.

In a fifth aspect the invention provides a method for 
detecting the presence of a particular O serotype and H 
serotype of E. coli in a sample, the method comprising the 
following steps: 

(a) specifically hybridising at least one pair of
nucleic acid molecules, at least one of which is derived
from and specific for a gene encoding a transferase or a
gene encoding an enzyme for the transport or processing of
a polysaccharide or oligosaccharide unit, the gene being
involved in the synthesis of the particular E. coli O
antigen, to any E. coli in the sample which contain the
gene;

(b) specifically hybridising at least one pair of
nucleic acid molecules, at least one of which is derived
from and specific for a particular flagellin gene
associated with the particular H serotype, to any E. coli
in the sample which contain the gene; and

(c) detecting any specifically hybridised nucleic 
acid molecules. 

Preferred nucleic acid molecules for the detection of
particular flagellin genes are listed in Table 3.

In one preferred embodiment, the sequence of the
nucleic acid molecule specific for the O antigen is
specific to the nucleotide sequence encoding the O111
antigen. More preferably, the sequence is derived from a
gene selected from the group consisting of wbdH
(nucleotide position 739 to 1932 of Figure 5), wzx
(nucleotide position 8646 to 9911 of Figure 5), wzy
(nucleotide position 9901 to 10953 of Figure 5), wbdM
(nucleotide position 11821 to 12945 of Figure 5) and
fragments of those molecules of at least 10-12 nucleotides
in length. Particularly preferred nucleic acid molecules
are those set out in Tables 8 and 8A, with respect to the
above mentioned genes.

In another preferred embodiment, the sequence of the
nucleic acid molecule specific for the O antigen is
specific to the nucleotide sequence encoding the O157
antigen. More preferably, the sequence is derived from a
gene selected from the group consisting of wbdN (nucleotide
position 79 to 861 of Figure 6), wbdO (nucleotide position
2011 to 2757 of Figure 6), wbdP (nucleotide position 5257
to 6471 of Figure 6), wbdR (nucleotide position 13156 to
13821 of Figure 6), wzx (nucleotide position 2744 to 4135
of Figure 6) and wzy (nucleotide position 858 to 2042 of
Figure 6) and fragments of those molecules of at least
10-12 nucleotides in length. Particularly preferred nucleic
acid molecules are those set out in Tables 9 and 9A, with
respect to the above mentioned genes.

In one preferred embodiment the detection method is a 
polymerase chain reaction method. More preferably, the 
nucleic acid molecules are labelled and hybridisation of 
the nucleic acid molecule is detected by electrophoresis.

The present inventors believe that based on the
teachings of the present invention and available
information concerning O antigen gene clusters, and
through use of experimental analysis, comparison of
nucleic acid sequences or predicted protein structures,
nucleic acid molecules in accordance with the invention
can be readily derived for any particular O antigen of
interest. Suitable bacterial strains can typically be
acquired commercially from depositary institutions.

There are currently 166 defined E. coli O antigens.
Samples of the 166 different E. coli O antigen
serotypes are available from Statens Serum Institut,
Copenhagen, Denmark.

The inventors envisage rare circumstances whereby two
genetically similar gene clusters encoding serologically
different O antigens have arisen through recombination of
genes or mutation so as to generate polymorphic variants.
In these circumstances multiple pairs of oligonucleotides
may be selected to provide hybridisation to the specific
combination of genes. The invention thus envisages the
use of a panel containing multiple nucleic acid molecules
for use in the method of testing for O antigen in
conjunction with H antigen, wherein the nucleic acid
molecules are derived from genes encoding transferases
and/or enzymes for the transport or processing of a
polysaccharide or oligosaccharide unit including wzx or
wzy genes, wherein the panel of nucleic acid molecules is
specific to a particular O antigen. The panel of nucleic
acid molecules can include nucleic acid molecules derived
from O antigen sugar pathway genes where necessary.

The inventors also found two mutated flagellin genes
from H type strains for H35 and H54 which have insertion
sequences inserted into normal flagellar genes identical
or near identical to those of the H11 and H21 type
strains respectively. Thus, primers for H11 and H21
(listed in Table 3) would also amplify fragments in H35
and H54, which differ in size from those in H11 and H21
respectively. The inventors also provide two pairs of
primers each for H35 and H54 based on the insertion
sequence (see H35 and H54 columns in Table 3). The use of
one of them in combination with one of the H11 or H21
primers will generate a PCR band only in H35 or H54
respectively, and this will also differentiate H35 and H54
from H11 and H21 respectively.

The present invention also relates to methods of
detecting the presence of particular E. coli H antigens or
H antigen and O antigen combinations where one or more
nucleic acid molecules which generate a particular size
fragment indicative of the presence of that H antigen are
used or in which the combination of one antigen specific
primer for that H antigen with another primer for a
related H antigen provides for the detection of the
particular H antigen by hybridisation to the relevant
gene. Preferably, the H antigen is H11, H21, H35 or H54.
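
The size-based and primer-combination approaches described above can be sketched as a simple in silico amplification. This is a minimal illustration under stated assumptions: exact primer matching only, a hypothetical `product_length` helper, and short made-up sequences standing in for an intact gene, the same gene carrying an insertion, and the primers. None of these strings comes from Table 3 or the sequence listing.

```python
# Toy model only: exact matching, invented sequences, hypothetical names.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def product_length(template: str, forward: str, reverse: str):
    """Length of the product bounded by `forward` and the binding site of
    `reverse` on the given strand, or None if either primer has no site."""
    start = template.find(forward)
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    site = template.find(reverse_site, start + len(forward))
    if site == -1:
        return None
    return site + len(reverse_site) - start


LEFT, RIGHT = "ATGGCACTTAGG", "CATCGGTACGCC"   # stand-in gene halves
INSERTION = "TTTTGACCTGAAAA"                   # stand-in insertion sequence
intact_gene = LEFT + RIGHT
disrupted_gene = LEFT + INSERTION + RIGHT

gene_forward = "ATGGCA"
gene_reverse = reverse_complement("GGTACGCC")
insertion_forward = "GACCTG"

# Gene-specific pair: a product in both templates, but of different size.
size_intact = product_length(intact_gene, gene_forward, gene_reverse)
size_disrupted = product_length(disrupted_gene, gene_forward, gene_reverse)

# Insertion primer paired with a gene primer: product only with the insertion.
only_disrupted = product_length(disrupted_gene, insertion_forward, gene_reverse)
none_in_intact = product_length(intact_gene, insertion_forward, gene_reverse)
```

In this toy model the gene-specific pair distinguishes the two templates by product size, while the insertion-based pair gives a product only from the template carrying the insertion, mirroring the two strategies in the text.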

The pairs of nucleic acid molecules where the method 
of the fifth aspect is used may both hybridise to the 
relevant H or O antigen gene or alternatively only one may 
hybridise to the relevant gene and the other to another 
site. 

The inventors recognise in applying the methods of 
the invention for detecting combinations of O and H 
antigens to samples, that the methods do not indicate 
whether a positive result for a particular O and H antigen 
combination arises because the O and H antigen are 
present on a single E. coli strain present in the sample 
or are present on different E. coli strains present in the 
sample. Because the ability to identify the presence of 
E. coli strains with particular O and H antigen 
combinations is highly desirable (due to the relationship 
between particular combinations and pathogenicity) the 
determination that a particular combination is present in 
a sample can be followed by isolation of single colonies
and checking whether they contain the relevant
combination by using the same method again or using
antibody labelled magnetic beads to separate cells 
expressing the particular O or H antigen and then testing 
the isolated cells for the other serotype. 
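
The ambiguity described above, and its resolution by testing single colonies, can be illustrated with a minimal sketch. The colony names, the data layout (one O type and one H type per colony) and both function names are hypothetical and serve only to show the logic.

```python
# Hypothetical data layout: colony name -> (O serotype, H serotype).
def sample_signal(colonies: dict, o_type: str, h_type: str) -> bool:
    """True if a pooled sample would test positive for both antigens,
    regardless of whether they occur in the same strain."""
    has_o = any(o == o_type for o, _ in colonies.values())
    has_h = any(h == h_type for _, h in colonies.values())
    return has_o and has_h


def colonies_with_combination(colonies: dict, o_type: str, h_type: str) -> list:
    """Names of single colonies that carry both antigens together."""
    return sorted(
        name for name, (o, h) in colonies.items() if o == o_type and h == h_type
    )


# A mixed sample: the O and H antigens sit on different strains.
mixed_sample = {"colony_1": ("O157", "H11"), "colony_2": ("O55", "H7")}
# A sample in which one strain carries the combination.
single_strain_sample = {"colony_1": ("O157", "H7")}
```

Here `sample_signal(mixed_sample, "O157", "H7")` is true even though `colonies_with_combination` returns an empty list for the same sample, which is the situation the follow-up colony test is meant to resolve.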

In addition, as mentioned above, the present
inventors have established the existence of H7 primers
specific to the O157 and O55 serotypes. Using such
primers it is possible to detect particular O and H
antigen combinations with the use of H specific nucleic
acid molecules.

In a sixth aspect the invention provides a method for
detecting the presence of a particular O serotype and H
serotype of E. coli in a sample, the method comprising the
following steps:

(a) specifically hybridising at least one nucleic
acid molecule, derived from and specific for a gene
encoding a flagellin associated with a particular E. coli
H antigen serotype to any E. coli carrying the gene and
present in the sample; and

(b) detecting the at least one specifically
hybridised nucleic acid molecule, wherein the at least one
nucleic acid molecule is specific for the particular
combination of O and H antigen.

Preferably the combination is O55:H7 or O157:H7.

The ability to detect the O157:H7 combination from a
particular H7 primer or pair is of particular use given
the association of this combination with pathogenic
strains.

In a seventh aspect the present invention provides a
method for testing a food derived sample for the presence
of one or more particular E. coli O antigens and H
antigens comprising testing the sample by a method of the
fourth, fifth or sixth aspect of the invention.

In an eighth aspect the present invention provides a
method for testing a faecal derived sample for the
presence of one or more particular E. coli O antigens and
H antigens comprising testing the sample by a method of
the fourth, fifth or sixth aspect of the invention.

In a ninth aspect the present invention provides a
method for testing a patient or animal derived sample for
the presence of one or more particular E. coli O antigens
and H antigens comprising testing the sample by a method
of the fourth, fifth or sixth aspect of the invention.




Preferably, the method of the seventh, eighth or
ninth aspect of the invention is a polymerase chain
reaction method. More preferably the oligonucleotide
molecules for use in the method are labelled. Even more
preferably the hybridised nucleic acid molecules are
detected by electrophoresis.

In the above described methods it will be understood
that where pairs of nucleic acid molecules are used one of
the nucleic acid molecules may hybridise to a sequence
that is not from the O antigen transferase, wzx or wzy
gene or the flagellin gene. Further where both hybridise
to these genes the O antigen molecules may hybridise to
the same or a different one of these genes.

In a tenth aspect the present invention provides a kit
for identifying the H serotype of E. coli, the kit
comprising:

at least one nucleic acid molecule derived from and
specific for an E. coli flagellin gene.

In an eleventh aspect the present invention provides
a kit for identifying the H and O serotype of E. coli, the
kit comprising:

(a) at least one nucleic acid molecule derived from and
specific for an E. coli flagellin gene; and

(b) at least one nucleic acid molecule derived from
and specific for a gene encoding a transferase or a gene
encoding an enzyme for the transport or processing of a
polysaccharide or oligosaccharide unit, the gene being
involved in the synthesis of a particular E. coli O
antigen.

The nucleic acid molecules may be provided in the
same or different vials. The kit may also provide in the
same or separate vials a second set of specific nucleic
acid molecules.

Particularly preferred nucleic acid molecules for
inclusion in the kits are those specified in Tables 3, 8,
8A, 9 and 9A as described above.

DEFINITIONS

In this specification, we have used the term "flagellin gene"
in many cases where previously one would have used "fliC",
to allow for the uncertainty as to locus introduced by
recent observations. However, uncertainty as to the locus
does not alter the fact that most E. coli strains express
a single H antigen and that a single flagellin gene
sequence per strain is required to give the genetic basis
for H antigen variation. Any use of the name fliC in this
specification where a different locus is later shown to be
involved would not affect the validity of conclusions
drawn regarding application of information based on the
sequence, where the conclusions do not relate to the map
position. Thus it is generally the nucleic acid molecule
itself which is of importance rather than the name
attributed to the gene. When it is known or suspected
that the gene encoding the H antigen is not in the fliC
locus, we use the term flagellin rather than fliC.

The phrase "a nucleic acid molecule derived from a
gene" means that the nucleic acid molecule has a
nucleotide sequence which is either identical or
substantially similar to all or part of the identified
gene. Thus a nucleic acid molecule derived from a gene
can be a molecule which is isolated from the identified
gene by physical separation from that gene, or a molecule
which is artificially synthesised and has a nucleotide
sequence which is either identical to or substantially
similar to all or part of the identified gene. While some
workers consider only the DNA strand with the same
sequence as the mRNA transcribed from the gene, here
either strand is intended.
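
The "either strand" reading can be illustrated with a minimal sketch that covers the identical-sequence case only; "substantially similar" is not modelled here. The function names and the toy gene string are hypothetical.

```python
# Exact-match illustration only; toy sequence, hypothetical names.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def matches_either_strand(oligo: str, gene: str) -> bool:
    """True if `oligo` is identical to part of `gene` on either strand."""
    return oligo in gene or reverse_complement(oligo) in gene


toy_gene = "ATGGCACTTAGG"
```

With this toy gene, both `"GGCACT"` (same strand) and `"CCTAAG"` (the reverse complement of `"CTTAGG"`) count as matching, whereas `"AAAAAA"` does not.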

Transferase genes are regions of nucleic acid which
have a nucleotide sequence which encodes gene products
that transfer monomeric sugar units.

Flippase or wzx genes are regions of nucleic acid
which have a nucleotide sequence which encodes a gene
product that flips oligosaccharide repeat units generally
composed of three to six monomeric sugar units to the
external surface of the membrane.

Polymerase or wzy genes are regions of nucleic acid
which have a nucleotide sequence which encodes gene
products that polymerise repeating oligosaccharide units
generally composed of three to six monomeric sugar units.

The nucleotide sequences provided in this
specification are described as anti-sense sequences. This
term is used in the same manner as it is used in Glossary
of Biochemistry and Molecular Biology Revised Edition,
David M. Glick, 1997 Portland Press Ltd., London on page
11 where the term is described as referring to one of the
two strands of double-stranded DNA, usually that which has
the same sequence as the mRNA. We use it to describe the
strand which has the same sequence as the mRNA.

NOMENCLATURE

Synonyms for E. coli O111 rfb

Current names    Our names    Bastin et al. 1991
wbdH             orf1
gmd              orf2
wbdI             orf3         orf3.4*
manC             orf4         rfbM*
manB             orf5         rfbK*
wbdJ             orf6         orf6.7*
wbdK             orf7         orf7.7*
wzx              orf8         orf8.9 and rfbX*
wzy              orf9
wbdL             orf10
wbdM             orf11

* Nomenclature according to Bastin D.A., et al. 1991 "Molecular
cloning and expression in Escherichia coli K-12 of the rfb gene
cluster determining the O antigen of an E. coli O111 strain". Mol.
Microbiol. 5(9): 2223-2231.

Other Synonyms

wzy     rfc
wzx     rfbX
rmlA    rfbA
rmlB    rfbB
rmlC    rfbC
rmlD    rfbD
glf     orf6*
wbbI    orf3#, orf8* of E. coli K-12
wbbJ    orf2#, orf9* of E. coli K-12
wbbK    orf1#, orf10* of E. coli K-12
wbbL    orf5#, orf11* of E. coli K-12

# Nomenclature according to Yao, Z. and M. A. Valvano 1994.
"Genetic analysis of the O-specific lipopolysaccharide biosynthesis
region (rfb) of Escherichia coli K-12 W3110: identification of genes
that confer group 6 specificity to Shigella flexneri serotypes Y and
4a". J. Bacteriol. 176: 4133-4143.

* Nomenclature according to Stevenson et al. 1994. "Structure of
the O-antigen of E. coli K-12 and the sequence of its rfb gene
cluster". J. Bacteriol. 176: 4144-4156.

The O antigen genes of many species were given rfb names (rfbA
etc.) and the O antigen gene cluster was often referred to as the
rfb cluster. There are now new names for the rfb genes as shown
in the table. Both terminologies have been used herein, depending
on the source of the information.

In the claims that follow and in the summary of the
invention, except where the context requires otherwise due
to express language or necessary implication, the word
"comprising" is used in the sense of "including", i.e. the
features specified may be associated with further features
in various embodiments of the invention.

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 shows EcoRI restriction maps of cosmid
clones pPR1054, pPR1055, pPR1056, pPR1058, pPR1287 which
are subclones of E. coli O111 O antigen gene cluster. The
thickened line is the region common to all clones.
Broken lines show segments that are non-contiguous on the
chromosome. The deduced restriction map for E. coli
strain M92 is shown above.

Figure 2 shows a restriction mapping analysis of E.
coli O111 O antigen gene cluster within the cosmid clone
pPR1058. Restriction enzymes are: B: BamHI; Bg: BglII;
E: EcoRI; H: HindIII; K: KpnI; P: PstI; S: SalI and X:
XhoI. Plasmids pPR1230, pPR1231, and pPR1288 are
deletion derivatives of pPR1058. Plasmids pPR1237,
pPR1238, pPR1239 and pPR1240 are in pUC19. Plasmids
pPR1243, pPR1244, pPR1245, pPR1246 and pPR1248 are in
pUC18, and pPR1292 is in pUC19. Plasmid pPR1270 is in
pT7T319U. Probes 1, 2 and 3 were isolated as internal
fragments of pPR1246, pPR1243 and pPR1237 respectively.
Dotted lines indicate that subclone DNA extends to the
left of the map into attached vector.

Figure 3 shows the structure of E. coli O111 O
antigen gene cluster.

Figure 4 shows the structure of E. coli O157 O
antigen gene cluster.

Figure 5 shows the nucleotide sequence (SEQ ID NO:
45) of the E. coli O111 O antigen gene cluster. Note: (1)
The first and last three bases of a gene are underlined
and in italics respectively; (2) The region which was
previously sequenced by Bastin and Reeves 1995 "Sequence
and analysis of the O antigen gene (rfb) cluster of
Escherichia coli O111" Gene 164: 17-23 is marked.

Figure 6 shows the nucleotide sequence (SEQ ID NO:
56) of the E. coli O157 O antigen gene cluster. Note: (1)
The first and last three bases of a gene (region) are
underlined and in italics respectively; (2) The region
previously sequenced by Bilge et al. 1996 "Role of the
Escherichia coli O157:H7 O side chain in adherence and
analysis of an rfb locus". Infect. Immun. 64:4795-4801 is
marked.

Figures 7 to 9 show the nucleotide sequences (SEQ ID
NOS: 66 to 68 respectively) obtained for flagellin genes
from E. coli type strains for H1 to H3 respectively. The
primer positions listed in Table 3 are based on treating
the first nucleotide of each of these sequences as No. 1.

Figures 10 to 18 show the nucleotide sequences (SEQ 
ID NOS: 6 to 14 respectively) obtained for flagellin genes 
from E. coli type strains for H4 to H12 respectively. The 
primer positions listed in Table 3 are based on treating 
the first nucleotide of each of these sequences as No. 1.

Figures 19 and 20 show the nucleotide sequences (SEQ
ID NOS: 15 to 16 respectively) obtained for flagellin
genes from E. coli type strains for H14 and H15
respectively. The primer positions listed in Table 3 are
based on treating the first nucleotide of each of these
sequences as No. 1.

Figures 22 to 26 show the nucleotide sequences (SEQ
ID NOS: 17 to 21 respectively) obtained for flagellin
genes from E. coli type strains for H17 to H21
respectively. The primer positions listed in Table 3 are
based on treating the first nucleotide of each of these
sequences as No. 1.

Figures 27 to 39 show the nucleotide sequences (SEQ 
ID NOS: 22 to 34) obtained for flagellin genes from E. 
coli type strains for H23 to H35 respectively. The primer 
positions listed in Table 3 are based on treating the 
first nucleotide of each of these sequences as No. 1. 

Figures 40 to 49 show the nucleotide sequences (SEQ 
ID NOS: 35 to 44) obtained for flagellin genes from E.
coli type strains for H37 to H46 respectively. The primer 
positions listed in Table 3 are based on treating the 
first nucleotide of each of these sequences as No. 1. 

Figures 50 to 55 show the nucleotide sequences (SEQ 
ID NOS: 46 to 51) obtained for flagellin genes from E.
coli type strains for H47 to H52 respectively. The primer
positions listed in Table 3 are based on treating the
first nucleotide of each of these sequences as No. 1.

Figures 56 to 58 show the nucleotide sequences (SEQ 
ID NOS: 52 to 54) obtained for flagellin genes from E. 
coli type strains for H54 to H56 respectively. The primer 
positions listed in Table 3 are based on treating the 
first nucleotide of each of these sequences as No. 1. 

Figure 59 shows the nucleotide sequence (SEQ ID NO: 
55) obtained for the flagellin gene from E. coli H7 strain
M1179. The primer positions listed in Table 3 are based
on treating the first nucleotide of this sequence as
No. 1.

Figures 60 to 68 show the nucleotide sequences (SEQ 
ID NOS: 57 to 65 respectively) obtained for flagellin 
genes from E. coli strains M1004, M1211, M1200, M1686,
M1328, M917, M527, M973, and M918 respectively. The primer
positions listed in Table 3 are based on treating the
first nucleotide of each of these sequences as No. 1.

Figure 69 shows the nucleotide sequence (SEQ ID NO:
1) of the fliC gene and DNA flanking the fliC gene from
the H25 type strain.

Figure 70A shows the nucleotide sequence (SEQ ID NO:
2) obtained from the 5' end of the insert of plasmid
pPR1989. The insert of plasmid pPR1989 encodes the second
flagellin gene of the H55 type strain.

Figure 70B shows the nucleotide sequence (SEQ ID NO:
3) obtained from the 3' end of the insert of plasmid
pPR1989. The insert of plasmid pPR1989 encodes the second
flagellin gene of the H55 type strain.

Figure 71 shows the nucleotide sequence (SEQ ID NO: 4)
obtained from the 5' end of the insert of plasmid pPR1993.
The insert of plasmid pPR1993 encodes the second
flagellin gene of the H36 type strain.

Figure 72 shows the nucleotide sequence (SEQ ID NO: 5)
obtained from the 3' end of the insert of plasmid pPR1993.
The insert of plasmid pPR1993 encodes the second
flagellin gene of the H36 type strain.

Figure 73A shows the sequence of the polylinker and the
SD sequence of plasmid pTrc99A. 

Figure 73B shows the sequence of the junction region 
between the SD sequence and the start of flagellin gene in 
the plasmids used for the expression of flagellin genes. 

BEST METHOD OF CARRYING OUT THE INVENTION 

In carrying out the methods of the invention with
respect to the testing of particular sample types,
including samples from food, patients, animals and faeces,
the samples are prepared by techniques routinely
used in the preparation of such samples for DNA based
testing. The steps for testing the samples using
particular nucleic acid molecules in assay formats such as
Southern blots and PCR are performed under routinely
determined conditions appropriate to the sample and the
nucleic acid molecules.

H antigen

Materials and Methods 

1. Bacterial strains and plasmid: 

There are 54 H types in E. coli [Ewing, W.H.: Edwards
and Ewing's identification of the Enterobacteriaceae,
Elsevier Science Publishers, Amsterdam, The Netherlands,
1986]: note that H antigens from 1 to 57 were listed and that
13, 22 and 57 are not valid. All the standard H type
strains except H16 were obtained from the Institute of
Medical and Veterinary Science, Adelaide, Australia. The
primary stocks are held at the Statens Serum Institut,
Copenhagen, Denmark.

The additional H7 strains used are listed in Table 1. 

We do not have the type strain for H16. It is known
that the H3 type strain is biphasic and can also express
the H16 flagellin gene [Ratiner, Y. A. (1985) "Two genetic
arrangements determining flagellar antigen specificities
in two diphasic E. coli strains" FEMS Microbiol Lett 19:
317-323]. We have sequenced and cloned the H16 flagellin
gene from the H3 type strain (see below).

E. coli K-12 strain C600 hsm hsr fliC::Tn10
[Kuwajima, G. (1988) "Flagellin domain that affects H
antigenicity of E. coli K-12" J. Bacteriol. 170: 485-488]
(laboratory stock no. M2126) was obtained from Dr Benita
Westerlund-Wikstrom of the Department of Biosciences,
University of Helsinki, Finland. E. coli K-12 strain
EJ2282 (laboratory no. P5560) is a fliC deletion strain,
and was obtained from Dr Masatoshi Enomoto of the
Department of Biology, Okayama University, Japan
[Tominaga, A., M. A.-H. Mahmoud, T. Mukaihara and M.
Enomoto (1994) "Molecular characterization of intact but
cryptic, flagellin genes in the genus Shigella" Mol.
Microbiol. 12: 277-285].

Plasmid pTrc99A was purchased from Pharmacia LKB
(Melbourne, VIC, Australia).

2. Antisera:

Antisera against H1, H3, H8, H14, H15, H17, H23, H24,
H25, H26, H29, H30, H31, H32, H33, H35, H36, H37, H38,
H39, H43, H44, H46, H47, H48, H49, H52, H53, H54, H55, and
H56 were obtained from the Institute of Medical and
Veterinary Science, Adelaide, Australia. Antisera against
H2, H4, H5, H6, H7, H9, H10, H11, H12, H16, H18, H19, H20,
H21, H27, H28, H34, H40, H41, H42, H45, and H51 were
obtained from Denka Seiken Co., Ltd, Tokyo, Japan.

Antiserum to type H50 was not available from any known
source.

The antisera available were checked against the 
appropriate type strains to confirm the specificities of 
both flagellin H antigen and H antisera: 52 sera (all 
those except anti-H16 serum listed above) gave a positive 
reaction with the corresponding type strains for that 
serum. 
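
The bookkeeping in sections 1 and 2 can be cross-checked with a short sketch. It uses only the numbers stated above (H1 to H57 listed, with 13, 22 and 57 not valid, and the two antisera lists); the variable and function names are illustrative and not part of the specification.

```python
# Valid E. coli H types: 1..57 were listed; 13, 22 and 57 are not valid.
INVALID = {13, 22, 57}
VALID_H = set(range(1, 58)) - INVALID

# Antisera lists as given in section 2 (one set per supplier).
IMVS = {1, 3, 8, 14, 15, 17, 23, 24, 25, 26, 29, 30, 31, 32, 33, 35, 36,
        37, 38, 39, 43, 44, 46, 47, 48, 49, 52, 53, 54, 55, 56}
DENKA = {2, 4, 5, 6, 7, 9, 10, 11, 12, 16, 18, 19, 20, 21, 27, 28, 34,
         40, 41, 42, 45, 51}


def missing_antisera(valid, *sources):
    """Return the valid H types for which no listed source supplies a serum."""
    available = set().union(*sources)
    return sorted(valid - available)


AVAILABLE = IMVS | DENKA
# Sera that can be checked against a type strain: all except anti-H16,
# because the H16 type strain itself was not available.
CHECKABLE = AVAILABLE - {16}

print(len(VALID_H), "valid H types")
print(len(AVAILABLE), "antisera obtained")
print("no antiserum for:", missing_antisera(VALID_H, IMVS, DENKA))
print(len(CHECKABLE), "sera checked against type strains")
```

The single H type left without an antiserum and the count of sera checked against type strains should agree with the statements in the text.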



3. Agglutination test: 

Bacteria from 1 ml of an overnight culture grown in
Luria broth (Difco Tryptone, 10 g/l; Difco yeast extract,
5 g/l; NaCl, 0.5 g/l; pH 7.2) at 30°C were centrifuged
(4000 rpm/10 min) and the bacterial pellet resuspended in
100 µl of saline. The agglutination test was carried out
by mixing equal volumes (5 µl) of both the cells and
antiserum on a slide. The slide was rocked for 1 minute
and then observed for agglutination. For all agglutination
tests, saline containing no antiserum was mixed with cells
to be used as a negative control.

For testing the H specificities of strain M2126 or
strain P5560 carrying a plasmid containing cloned flagellin
genes, cells of M2126 or P5560 were used as an additional
negative control.

All agglutination tests were first carried out using
undiluted antisera (note that the antisera we used had
been diluted before reaching our hands), except for
anti-H11, anti-H34, anti-H52 and anti-H26 serum, for which we
used 1:10 dilutions to avoid background agglutination. In
cases for which cross-reactions have been reported, we
carried out agglutination tests using serial dilutions of
sera (see section 10.1).

4. Motility test: 

The motility of strain M2126 or strain P5560 carrying
cloned flagellin genes was examined microscopically. 1 ml
of overnight culture grown in Luria broth (Difco Tryptone,
10 g/l; Difco yeast extract, 5 g/l; NaCl, 0.5 g/l; pH 7.2)
at 30°C was inoculated into 10 ml of Luria broth, and the
culture was shaken at 100 rpm at 30°C to early log phase
(OD 625 = 0.2). A loopful of culture was placed on a slide
and examined under a microscope. Motility of individual
cells was easily distinguished from Brownian movement and
streaming, and presence or absence of motility recorded.

5. Isolation of chromosomal DNA: 

Chromosomal DNA from all the 53 H type strains and
the strains listed in Table 1 was isolated using the
Promega Genomic isolation kit (Madison WI USA). Each
chromosomal DNA sample was checked by gel electrophoresis
of the DNA and by PCR amplification of the mdh gene using
oligonucleotides based on the E. coli K-12 mdh gene [Boyd,
E.F., Nelson, K., Wang, F.-S., Whittam, T.S. and Selander,
R.K.: Molecular genetic basis of allelic polymorphism in
malate dehydrogenase (mdh) in natural populations of
Escherichia coli and Salmonella enterica. Proc. Natl.
Acad. Sci. USA 91 (1994) 1280-1284].

6. PCR amplification of flagellin gene:

Flagellin genes from different strains were first
PCR amplified using one of the following four pairs of
oligonucleotides:

#1285 (5'-atggcacaagtcattaatac) and
#1286 (5'-ttaaccctgcagtagagaca);

#1417 (5'-ctgatcactcaaaataatatcaac) and
#1418 (5'-ctgcggtacctggttggc);

#1431 (5'-atggcacaagtcattaatacccaac) and
#1432 (5'-ctaaccctgcagcagagaca);

#1575 (5'-gggtggaaacccaatacg) and
#1576 (5'-gcgcatcaggcaatttgg).
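
As a convenience for handling the four primer pairs, the following sketch records them as listed and derives their lengths, reverse complements and a rough melting temperature. The Wallace rule used here is an assumption added for illustration only; the annealing temperatures actually used are those of Table 2, and the helper names are hypothetical.

```python
# The four primer pairs from section 6, written 5'->3' as listed.
PRIMER_PAIRS = {
    ("#1285", "#1286"): ("atggcacaagtcattaatac", "ttaaccctgcagtagagaca"),
    ("#1417", "#1418"): ("ctgatcactcaaaataatatcaac", "ctgcggtacctggttggc"),
    ("#1431", "#1432"): ("atggcacaagtcattaatacccaac", "ctaaccctgcagcagagaca"),
    ("#1575", "#1576"): ("gggtggaaacccaatacg", "gcgcatcaggcaatttgg"),
}

_COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq):
    """Return the reverse complement of a lower- or upper-case DNA string."""
    return seq.lower().translate(_COMPLEMENT)[::-1]


def wallace_tm(seq):
    """Rough Tm in deg C by the Wallace rule: 2(A+T) + 4(G+C).

    Illustrative estimate only; not the method used in the specification.
    """
    s = seq.lower()
    at = s.count("a") + s.count("t")
    gc = s.count("g") + s.count("c")
    return 2 * at + 4 * gc


for names, seqs in PRIMER_PAIRS.items():
    for name, seq in zip(names, seqs):
        print(name, len(seq), "nt", "Tm~", wallace_tm(seq),
              "rc:", reverse_complement(seq))
```

The reverse complement of a 3' primer gives the corresponding coding-strand sequence, which is useful when comparing a primer with the end of a sequenced gene.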

PCR reactions were carried out under the following
conditions: denaturing, 94°C/30'; annealing, temperature
varies (refer to Table 2)/30'; extension, 72°C/1'; 30
cycles. The PCR product was purified using the Promega
Wizard PCR purification kit (Madison WI USA) before being
sequenced.

The H36 and H53 type strains gave two PCR bands using
primer pairs #1431/#1432 and #1417/#1418 respectively, and 
were not sequenced. 

7. Enzymes and buffers: 

Restriction endonucleases and T4 DNA ligase were
purchased from Boehringer Mannheim (Castle Hill, NSW,
Australia). Restriction enzymes were used in the
recommended commercial buffer.

8. Sequencing of the flagellin genes:

Each PCR product was first sequenced using the
oligonucleotide primers used for the PCR amplification.
Primers based on the obtained sequence were then used to
sequence further, and this procedure was repeated until
the entire PCR product was sequenced.

The sequencing reactions were performed using the
DyeDeoxy Terminator Cycle Sequencing method (Applied
Biosystems, CA, USA), and reaction products were analysed
using fluorescent dye and an ABI377 automated sequencer
(CA, USA).

Sequence data were processed and analysed using
Staden programs [Sacchi C T, Zanella R C, Caugant D A,
Frasch C E, Hidalgo N T, Milagres L G, Pessoa L L, Ramos S
R, Camargo M C C and Melles C E A "Emergence of a new
clone of serogroup C Neisseria meningitidis in Sao Paulo,
Brazil" J. Clin. Microbiol. 30 (1992) 1282-1286;
Staden, R. : Automation of the computer handling of gel 
reading data produced by the shotgun method of DNA 
sequencing. Nucl. Acids Res. 10 (1982a) 4731-4751;
Staden, R.: An interactive graphics program for comparing
and aligning nucleic acid and amino acid sequences. Nucl.
Acids Res. 10 (1982b) 2951-2961;

Staden, R. : Computer methods to locate signals in nucleic 
acid sequences. Nucl. Acids Res. 12 (1984a) 505-519; 
Staden, R. : Graphic methods to determine the function of 
nucleic acid sequences. A summary of ANALYSEQ options. 
Nucl. Acids Res. 12 (1984b) 521-538; 

Staden, R. : The current status and portability of our 
sequence handling software. Nucl. Acids Res. 14 (1986) 
217-231] . 

We were able to PCR amplify flagellin genes from H
type strains for H7, 23, 12, 51, 45, 49, 19, 9, 30, 32,
26, 41, 15, 20, 28, 46, 31, 14, 18, 6, 34, 48, 43, 10, 52,
and also from H7 strains M1004, M527, M1686, M1211, M1328,
M973, M1179, M1200, M917, and M918 using primers #1575 and
#1576, which are based on sequences 51-34 bp upstream and
37-54 bp downstream of the start and end of the E. coli K-12
fliC gene respectively. Thus, the full sequence of the
flagellin gene from these strains was obtained, and the use
of flanking sequence for primers makes it highly likely
that they are at the fliC locus.
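
The coordinates given for primers #1575 and #1576 can be checked arithmetically. This is a minimal sketch that assumes positions are counted from the first and last base of the gene; the gene length passed to `amplicon_length` is a placeholder, not a value taken from the specification.

```python
def span_length(far, near):
    """Length in bases of a primer covering positions far..near inclusive."""
    return abs(far - near) + 1


def amplicon_length(gene_len, upstream_far=51, downstream_far=54):
    """Expected product length when the 5' primer starts 51 bp upstream of
    the gene and the 3' primer ends 54 bp downstream of it."""
    return upstream_far + gene_len + downstream_far


print(span_length(51, 34))      # length of the region covered by #1575
print(span_length(37, 54))      # length of the region covered by #1576
print(amplicon_length(1497))    # 1497 bp is a hypothetical gene length
```

Both spans agree with the 18-base length of the two primers as listed in section 6.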

For other strains, we were only able to amplify the
flagellin gene using one or more of the other three pairs
of primers, which are based on sequence within the fliC
gene, and thus only partial sequence was obtained. These
amplicons may be of the fliC gene or one of the
alternative flagellin genes. The flagellin gene sequences
obtained from H type strains for H40, 8, 21, 47, 11, 27,
35, 2, 3, 24, 37, 50, 4, 44, 38, 55, 29, 33, 5, and 56
lack 18 and 14 codons at the 5' and 3' ends respectively.
The flagellin gene sequence of H39 obtained using primers
#1285/#1286 lacks 18 and 19 codons at the 5' and 3' ends
respectively. The flagellin gene sequences of the H type
strains for H17, 25 and 42 lack 23 and 21 codons at the 5'
and 3' ends respectively. The flagellin gene sequence of the
H type strain for H54 lacks 23 and 12 codons at the 5' and
3' ends respectively. There is very little variation in
the sequence at the two ends of flagellin genes, and
antigenic variation is due to variation in the central
region of the gene. The absence of sequence for the ends
of some of the flagellin genes is not important for the
purpose of the present invention relating to the detection
of antigenic variation by DNA sequence based means.

The fliC genes from H type strains of H1, H7 and H12
have been sequenced previously [Schoenhals, G. and
Whitfield, C.: Comparative analysis of flagellin sequences
from Escherichia coli strains possessing serologically
distinct flagellar filaments with a shared complex surface
pattern. J. Bacteriol. 175 (1993) 5395-5402] and we did
not sequence the gene from the H1 strain.

We have sequenced fliC genes from a set of H7 strains
with different O antigens, including that of fliC from the
H7 type strain as one of the set: we have found four
differences from the published H7 sequence (GenBank
accession number L07388) which we believe are due to
errors in the published sequence.

We have also re-sequenced the fliC gene from the H12
type strain, and have found one difference from the
published H12 sequence (GenBank accession number L07389)
which we believe is due to an error in the published
sequence.

The flagellin genes from type strains H35 and H54
were also amplified using primers #1431/#1432, which are
based on sequence within the fliC gene. Sequence data
revealed that these two genes would be non-functional due
to an insertion sequence inserted in the middle of them. We
have sequenced them to facilitate selection of primers for
the functional flagellin genes.

9. Cloning of flagellin genes:

DNA was digested for 2 hr at 37°C with appropriate
restriction enzyme(s). The reaction product was then
extracted once with phenol, and twice with ether. DNA was
precipitated with 2 vols of ethanol and resuspended in
water before the ligation reaction was carried out.
Ligation was carried out overnight at 4°C and the ligated
DNA was electroporated into one of the E. coli fliC mutant
strains.

9.1. Cloning of flagellin genes from type strains for H1,
H2, H3, H5, H6, H7, H9, H10, H11, H12, H14, H15, H18, H19,
H20, H21, H24, H26, H27, H28, H29, H31, H34, H38, H39,
H41, H42, H43, H45, H46, H49, H51, H52, and H56:

The full flagellin gene was PCR amplified using
primers #1868 and #1870 (Table 3A). Both these primers
are based on the sequences of the H7 flagellin gene of the
H7 type strain. #1868 is the 5' primer: there is an NcoI
site incorporated into the primer (Table 3B) and the
flagellin gene starts at base 3 of the NcoI site. The 3'
primer #1870 has a BamHI site incorporated downstream of
the stop codon of the flagellin gene (Table 3B). PCR
reactions were carried out under the following conditions:
denaturing, 94°C/30'; annealing, temperature varies (refer
to Table 3A)/30'; extension, 72°C/1'; 30 cycles. The PCR
product was purified using the Promega Wizard PCR
purification kit (Madison WI USA) before being digested by
restriction enzymes NcoI and BamHI and cloned into the
NcoI/BamHI sites of plasmid pTrc99A.

Plasmid pTrc99A has a strong trc promoter upstream of
the polylinker. Downstream of the promoter, it contains
the ribosome binding site (SD sequence, see Fig 73) which
is located 8 bp upstream of the ATG site within the NcoI
site. The polylinker and the SD sequence of pTrc99A are
shown in Fig 73.
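
The statement that the flagellin gene starts at base 3 of the NcoI site follows from the recognition sequence itself, as the sketch below shows. The second helper is a check one might run before choosing cloning enzymes; the toy insert is hypothetical and is not a flagellin sequence from the figures.

```python
NCOI = "CCATGG"    # NcoI recognition site; it contains an ATG start codon
BAMHI = "GGATCC"   # BamHI recognition site


def start_codon_base(site=NCOI):
    """1-based position of the ATG within the restriction site."""
    return site.find("ATG") + 1


def internal_sites(gene, site):
    """0-based positions at which `site` occurs inside `gene`."""
    gene, site = gene.upper(), site.upper()
    width = len(site)
    return [i for i in range(len(gene) - width + 1)
            if gene[i:i + width] == site]


print(start_codon_base())

# Hypothetical toy insert carrying one internal BamHI site.
toy_insert = "ATGGCACAAGTCGGATCCATTAATACC"
print(internal_sites(toy_insert, BAMHI))
```

An insert with an internal site for one of the cloning enzymes would be cut during digestion, so a scan of this kind is a common precaution; the specification does not state whether this motivated the choice of 3' primers.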

The plasmids generated were given pPR numbers, and
they are listed in Table 3A. In these plasmids, the
expression module consists of the trc promoter, the SD
sequence, and the full flagellin gene. The sequence of the
junction region between the SD sequence and the start of
the flagellin gene is shown in Fig 73.

For flagellin genes from type strains for H6, H7, H9,
H10, H12, H14, H18, H19, H20, H26, H28, H31, H41, H43,
H45, H46, H49, H51, and H52, we have the full sequence for
each gene and the primer sequences (#1868 and #1870) are
conserved among them. The cloned genes therefore have the
same sequence as those from the type strains.

For flagellin genes from type strains for H1, H15 and
H34, we also have the full sequence. The previously
published sequence of the flagellin gene from the H1 type
strain was extracted from GenBank (accession number
L07387) and used. Primer #1868 is conserved in all three.
However, primer #1870 changes the third base of the fifth
last codon in the H1 sequence from A to G, and the
third base of the second last codon in the H15 and H34
sequences from C to T: these changes do not alter
the amino acids encoded, so the cloned genes encode the same
gene products as those of the type strains.

For flagellin genes from type strains for H2, H3, H5,
H11, H21, H24, H27, H29, H38, H39, H42, and H56, we do not
have the full sequences. In the plasmids carrying genes
from these type strains, the expression module consists of
the trc promoter, the SD sequence, and the full flagellin
gene with the first and the last 21 base pairs being
determined by the primer sequences which are based on the
H7 flagellin gene of the H7 type strain. The sequence of
the junction region between the SD sequence and the start
of the flagellin gene is shown in Fig 73.

9.2. Cloning of the flagellin gene from type strain of
H23:

The full flagellin gene was PCR amplified using
primers #1868 and #1869 (Table 3A). #1868 is the 5'
primer: there is an NcoI site incorporated into the primer
(Table 3B) and the flagellin gene starts at base 3 of the
NcoI site. The 3' primer #1869 has a SalI site
incorporated downstream of the stop codon of the flagellin
gene (Table 3B). PCR reactions were carried out under the
following conditions: denaturing, 94°C/30'; annealing,
55°C/30'; extension, 72°C/1'; 30 cycles. The PCR product
was purified using the Promega Wizard PCR purification kit
(Madison WI USA) before being digested by restriction
enzymes NcoI and SalI and cloned into the NcoI/SalI sites
of plasmid pTrc99A to give plasmid pPR1942.

Plasmid pTrc99A has a strong trc promoter upstream of 
the polylinker. Downstream of the promoter, it contains 
the ribosome binding site (SD sequence, see Fig 73) which 
is located 8bp upstream of the ATG site within the Ncol 
site. The polylinker and the SD sequence of pTrc99A are 
shown in Fig 73. 

The expression module of pPR1942 consists of the trc 
promoter, the SD sequence, and the full flagellin gene. 
The sequence of the junction region between the SD 
sequence and the start of flagellin gene is shown in Fig 
73 . 

9.3. Cloning of flagellin genes from type strains of H30,
H32 and H33:

The full flagellin gene was PCR amplified using
primers #1868 and #1871 (Table 3A). #1868 is the 5'
primer: there is an NcoI site incorporated into the primer
(Table 3B) and the flagellin gene starts at base 3 of the
NcoI site. The 3' primer #1871 has a PstI site
incorporated downstream of the stop codon of the flagellin
gene (Table 3B). PCR reactions were carried out under the
following conditions: denaturing, 94°C/30'; annealing,
temperature varies (refer to Table 3A)/30'; extension,
72°C/1'; 30 cycles. The PCR product was purified using the
Promega Wizard PCR purification kit (Madison WI USA)
before being digested by restriction enzymes NcoI and PstI
and cloned into the NcoI/PstI sites of plasmid pTrc99A.

Plasmid pTrc99A has a strong trc promoter upstream of 
the polylinker. Downstream of the promoter, it contains 
the ribosome binding site (SD sequence, see Fig 73) which 
is located 8 bp upstream of the ATG site within the NcoI
site. The polylinker and the SD sequence of pTrc99A are 
shown in Fig 73. 

For flagellin genes from type strains for H30 and
H32, we have the full sequence. The primer #1868 sequence is
conserved in both of them. However, primer #1871 has the third
base of the fourth last codon in both sequences changed
from G to A to remove a PstI site (see Table 3B): this
change does not alter the amino acid encoded. The expression
module consists of the trc promoter, the SD sequence, and
the full flagellin gene coding for a gene product which is
the same as that of the type strain. The sequence of the
junction region between the SD sequence and the start of
the flagellin gene is shown in Fig 73.
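
The silent G to A change that removes a PstI site can be illustrated as follows. This is a sketch under a stated assumption: the 3' end of the gene is taken to match the reverse complement of primer #1432 from section 6, since the H30 and H32 sequences themselves are given only in the figures. The codon table is the standard genetic code.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built in TCAG order for each codon position.
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def translate(dna):
    """Translate a DNA string in frame 1, ignoring any trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


PSTI = "CTGCAG"

# Assumed last six codons (including the stop codon), derived from the
# reverse complement of primer #1432: TCT CTG CTG CAG GGT TAG.
wild_type = "TCTCTGCTGCAGGGTTAG"

# Third base of the fourth-last codon (CTG at index 6..8) changed G -> A.
mutant = wild_type[:8] + "A" + wild_type[9:]

print(translate(wild_type), PSTI in wild_type)
print(translate(mutant), PSTI in mutant)
```

Under this assumption the change converts CTG to CTA, both of which encode leucine, while destroying the internal CTGCAG so that only the PstI site added by the primer remains.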

We do not have the full sequence for the flagellin
gene from the H33 type strain. In the plasmid containing
the H33 type strain gene, the expression module consists
of the trc promoter, the SD sequence, and the full
flagellin gene with the first and the last 21 base pairs
being determined by the primer sequences which were used
for the cloning of H30 and H32. The sequence of the
junction region between the SD sequence and the start of
the flagellin gene is shown in Fig 73.

9.4. Flagellin genes from type strains for H4 and H17:

For the flagellin genes of the H4 and H17 type strains
the full sequence was not obtained, and the sequenced
parts were PCR amplified and cloned into plasmid pPR1951
to give in each case a gene in which the first 26 and the
last 31 codons are based on the sequence of the H7
flagellin gene of the H7 type strain.

9.4.1 Construction of expression plasmid vector
pPR1951:

The first 26 codons of the H7 flagellin gene were
first PCR amplified using primers #1868 and #1872 (Table
3B). #1868 is the 5' primer: there is an NcoI site
incorporated into the primer (Table 3B) and the flagellin
gene starts at base 3 of the NcoI site. Primer #1872 was
made to have the last two codons (codons 25 and 26)
changed from CTG TCG (Leucine and Serine) to GGA TCC
(Glycine and Serine) to generate a BamHI site. This PCR
fragment was digested with NcoI and BamHI before being
cloned into the NcoI/BamHI sites of pTrc99A to make
plasmid pPR1949.

The last 31 codons (including the stop codon) of the
H7 flagellin gene were PCR amplified using primers #1884
and #1871 (Table 3A). The 5' primer, #1884, has the first
two of the 31 codons changed from TCG AAA (Serine and
Lysine) to TCT AGA (Serine and Arginine) to generate an
XbaI site (Table 3B). The 3' primer #1871 has a PstI site
incorporated downstream of the stop codon (Table 3B). This
PCR fragment was digested with XbaI and PstI, and then
cloned into the XbaI/PstI sites of pPR1949 to make plasmid
pPR1951.
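
The two codon substitutions used to build pPR1951 can be verified directly from the codons quoted above. The sketch below uses the standard genetic code; the function and variable names are illustrative.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built in TCAG order for each codon position.
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def translate(dna):
    """Translate a DNA string whose length is a multiple of three."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# (original codon pair, engineered codon pair, restriction site created)
SUBSTITUTIONS = {
    "#1872, codons 25-26": ("CTGTCG", "GGATCC", "GGATCC"),           # BamHI
    "#1884, first two of last 31": ("TCGAAA", "TCTAGA", "TCTAGA"),   # XbaI
}

for label, (before, after, site) in SUBSTITUTIONS.items():
    print(label, translate(before), "->", translate(after),
          "| site created:", site in after and site not in before)
```

Each substitution keeps the serine and alters the neighbouring residue (Leu to Gly, Lys to Arg), which is the price paid for introducing the BamHI and XbaI sites.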

9.4.2 Cloning of flagellin genes from the H4 and
H17 type strains:

The central regions of flagellin genes from type
strains H4 and H17 were PCR amplified using primers #1878
and #1885 (Table 3B), which have a BamHI and an XbaI site
incorporated at their ends respectively. PCR reactions
were carried out under the following conditions:
denaturing, 94°C/30'; annealing, 65°C/30'; extension,
72°C/1'; 30 cycles. The PCR product was purified using the
Promega Wizard PCR purification kit (Madison WI USA)
before being digested by restriction enzymes BamHI and
XbaI and cloned into the XbaI/BamHI sites of plasmid
pPR1951 to make plasmids pPR1955 (H4) and pPR1957 (H17).

The expression module of plasmids pPR1955 and pPR1957
consists of the trc promoter, the SD sequence, the first
24 codons of the H7 flagellin gene (of the H7 type
strain), 2 codons encoding Glycine and Serine, 292 or 293
codons of the central region based on the flagellin gene
obtained from the H4 or H17 type strain respectively, 2
codons encoding Serine and Arginine, and then the last 29
codons of the H7 flagellin gene (of the H7 type strain).
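
The composition of the chimeric genes can be tallied as a consistency check against section 9.4.1. The sketch assumes, as stated there, that the last 31 codons include the stop codon; names are illustrative.

```python
# Fixed parts of the expression module, in codons.
H7_FIVE_PRIME = 24      # first 24 codons of the H7 gene
BAMHI_LINKER = 2        # Gly-Ser, codons 25 and 26
XBAI_LINKER = 2         # Ser-Arg, first two of the last 31 codons
H7_THREE_PRIME = 29     # remaining H7 codons, including the stop codon

CENTRAL_REGION = {"pPR1955 (H4)": 292, "pPR1957 (H17)": 293}


def total_codons(central):
    """Total codons in the chimeric gene, including the stop codon."""
    return (H7_FIVE_PRIME + BAMHI_LINKER + central
            + XBAI_LINKER + H7_THREE_PRIME)


for plasmid, central in CENTRAL_REGION.items():
    print(plasmid, total_codons(central), "codons")
```

The 24 + 2 and 2 + 29 groupings reproduce the "first 26" and "last 31" codons described for pPR1951.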

10. Expression of flagellin gene plasmids in E. coli
strains lacking the fliC gene, and identification of the H
antigens encoded by these plasmids:

Plasmids carrying flagellin genes as described in
section 9 (see Table 3A for a list) were electroporated
into strains M2126 or P5560. Strains M2126 and P5560 do
not have functional fliC genes, and are not motile when
examined under a microscope. Transformants carrying any of
the plasmids listed in Table 3A are motile when examined
under a microscope. Thus, the flagellin genes in all of
the plasmids are expressed.

The antigenic specificity of the flagellin of each 
transformant was then determined by slide agglutination. 

10.1 Flagellin genes from type strains for H2,
H5, H6, H7, H9, H11, H14, H15, H18, H19, H20, H21, H23,
H24, H26, H27, H28, H29, H30, H31, H32, H33, H34, H39,
H41, H42, H43, H45, H46, H49, H51, H52, and H56:

As shown in Table 3A, strains with plasmids carrying 
these flagellin genes expressed the same H antigen as 
their respective flagellin parent strain. 

For flagellin specificities H2, H5, H6, H7, H9, H14,
H15, H18, H19, H20, H23, H24, H26, H27, H28, H29, H31,
H33, H39, H51, H52, and H56, there was no cross reaction
reported between these flagellins and flagellin antisera
for other H antigens [Ewing, W. H.: Edwards and Ewing's
identification of the Enterobacteriaceae, Elsevier
Science Publishers, Amsterdam, The Netherlands, 1986], and
we conclude that we have in each case sequenced the gene
encoding the flagellin of the expected specificity from
the respective type strain.

It has been observed that cross reactions exist
between some type strains and certain antisera at
different levels of dilution (of the antisera), being H11
with anti-H21 and anti-H40, H21 with anti-H11, H30 with
anti-H32, H32 with anti-H30, H34 with anti-H24 and
anti-H31, H41 with anti-H37 and anti-H39, H42 with anti-H6,
H43 with anti-H37, H45 with anti-H20, H46 with anti-H17, and
H49 with anti-H39 [Ewing, W. H.: Edwards and Ewing's
identification of the Enterobacteriaceae, Elsevier
Science Publishers, Amsterdam, The Netherlands, 1986]. We
have tested strain M2126 or strain P5560 carrying plasmids
containing flagellin genes obtained from each of these
type strains (H11, H21, H30, H32, H34, H41, H42, H43, H45,
H46, and H49) with the appropriate cross-reacting
antisera.

For strain M2126 or strain P5560 carrying plasmids
containing flagellin genes obtained from type strains H11,
H34, H41, H42, H43, H45, H46, and H49, no cross reaction
was found. We conclude that we have in each case sequenced
the gene coding the flagellin of the expected specificity
from the respective type strain.

Cross reaction was observed for strain P5560 carrying
plasmid pPR1948 (containing the flagellin gene obtained
from the H30 type strain) with anti-H32 serum, strain
P5560 carrying pPR1940 (containing the flagellin gene
obtained from the H32 type strain) with anti-H30 serum,
and strain M2126 carrying plasmid pPR1995 (containing the
flagellin gene obtained from the H21 type strain) with
anti-H11 serum.

We note that the reported cross reactions between the
H30 type strain and anti-H32, the H32 type strain and
anti-H30, and the H21 type strain and anti-H11 happened at
a higher level of dilution (of antisera) than for all
other type strains with the antisera mentioned above
[Ewing, W. H.: Edwards and Ewing's identification of the
Enterobacteriaceae, Elsevier Science Publishers,
Amsterdam, The Netherlands, 1986]. We conclude that except
for these three cases, the antisera used were supplied at
a dilution which did not exhibit cross reactions. For the
three strains carrying flagellin genes cloned from type
strains for H30, H32, and H21, it was necessary to further
dilute the antiserum.

Strain P5560 carrying plasmid pPR1948 (containing the
flagellin gene obtained from the H30 type strain)
agglutinates with anti-H30 serum when the antiserum is
diluted to 1:60, but agglutinates with anti-H32 serum only
at a dilution of 1:10 and not at a 1:20 dilution (note
that the antisera we used had been diluted before
reaching our hands). In contrast, strain P5560 carrying
plasmid pPR1940 (containing the flagellin gene obtained from
the H32 type strain) agglutinates with anti-H32 serum when
the antiserum is diluted to 1:100, but agglutinates with
anti-H30 serum only at a 1:10 dilution and not at a 1:20
dilution. Thus, we conclude that the flagellin genes we
sequenced from type strains for H30 and H32 encode
flagellins of H30 and H32 specificities respectively.

Strain M2126 carrying plasmid pPR1995 (containing the
flagellin gene obtained from the H21 type strain)
agglutinates with anti-H21 serum when the antiserum is
diluted to 1:40, but agglutinates only with undiluted
anti-H11 serum and not at a 1:10 dilution (note that the
antisera we used had been diluted before reaching our
hands). In contrast, strain M2126 carrying plasmid pPR1981
(containing the flagellin gene obtained from the H11 type
strain) did not agglutinate with anti-H21 serum. Thus, we
conclude that the flagellin gene we sequenced from the type
strain for H21 encodes flagellin of H21 specificity.

10.2 Flagellin genes from type strains of H1 and H12:

These two genes are very similar in sequence, with 8
a.a. difference between the gene products. It has been
known that some cross-reaction exists between anti-H12
serum and the H1 type strain and between anti-H1 serum and
the H12 type strain [Ewing, W. H.: Edwards and Ewing's
identification of the Enterobacteriaceae, Elsevier
Science Publishers, Amsterdam, The Netherlands, 1986].
Strain M2126 carrying pPR1920 (carrying a flagellin gene
from the H1 type strain, Table 3A) agglutinates with
anti-H1 serum when the antiserum is diluted to 1:100, but
agglutinates only with undiluted anti-H12 serum and not at
a 1:10 dilution (please note that the antisera we used
had been diluted before reaching our hands). In contrast,
strain M2126 carrying plasmid pPR1990 (carrying a
flagellin gene from the H12 type strain, Table 3A)
agglutinates with anti-H12 serum when the antiserum is
diluted to 1:100, but agglutinates only with undiluted
anti-H1 serum and not at a 1:10 dilution. Thus, we
conclude that the flagellin genes we sequenced from type
strains for H1 and H12 encode flagellins of H1 and H12
specificities respectively.
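
The dilution comparisons in sections 10.1 and 10.2 follow one rule: a clone is assigned the specificity of the antiserum that still agglutinates at the highest dilution. The sketch below encodes the dilutions reported above, with 1 standing for undiluted serum as received. Treating each reported dilution as an end point is an assumption made for illustration; the text reports only that agglutination occurred at those dilutions.

```python
# Reciprocal of the highest reported agglutinating dilution per antiserum.
REPORTED = {
    "pPR1948 (H30 gene)": {"H30": 60, "H32": 10},
    "pPR1940 (H32 gene)": {"H32": 100, "H30": 10},
    "pPR1995 (H21 gene)": {"H21": 40, "H11": 1},
    "pPR1920 (H1 gene)": {"H1": 100, "H12": 1},
    "pPR1990 (H12 gene)": {"H12": 100, "H1": 1},
}


def assign_specificity(titres):
    """Return (antigen, fold_margin) for the antiserum reacting at the
    highest dilution, with its margin over the next-best antiserum."""
    ranked = sorted(titres.items(), key=lambda item: item[1], reverse=True)
    (best, best_titre), (_, next_titre) = ranked[0], ranked[1]
    return best, best_titre / next_titre


for clone, titres in REPORTED.items():
    antigen, margin = assign_specificity(titres)
    print(f"{clone}: {antigen} (margin {margin:g}-fold)")
```

In every case the homologous antiserum wins by at least several fold, which is the basis for the conclusions drawn in the text.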

10.3. Flagellin gene coding for H16:

Strain P5560 carrying plasmid pPR1969 agglutinated
with anti-H16 serum. pPR1969 carries a flagellin gene
amplified from the H3 type strain. It has been shown that
this H3 type strain is a biphasic strain which can express
H3 and H16 specificities [Ratiner, Y. A. (1985) "Two
genetic arrangements determining flagellar antigen
specificities in two diphasic E. coli strains" FEMS
Microbiol Lett 29: 317-323]. Thus, the H3 type strain has
two flagellin genes coding for H3 and H16 specificities.
We conclude that we have cloned and sequenced the H16
flagellin gene from this H3 type strain.

10.4 Flagellin gene coding for H4:

The flagellin genes obtained from type strains for H4
and H17 are nearly identical, with 4 a.a. differences in
the gene products. Plasmid pPR1955 carries a flagellin
gene from the H4 type strain, and plasmid pPR1957 carries
a flagellin gene from the H17 type strain. Strain P5560
carrying plasmid pPR1955 or plasmid pPR1957 agglutinated
with anti-H4 serum, but not with anti-H17 serum. It has
been shown that the type strain for H17 is a biphasic
strain which can express H17 and H4 [Ratiner, Y. A. (1985)
"Two genetic arrangements determining flagellar antigen
specificities in two diphasic E. coli strains" FEMS
Microbiol Lett 29: 317-323]. The flagellin gene obtained
from the type strain for H44 is also highly similar to that
obtained from the H4 type strain, with 2 a.a. differences
in the gene products. It has been shown that the H44 type
strain has two complete flagellin genes, being H4 and H44
[Ratiner, Y. A. (1998) "New flagellin specifying genes in
some E. coli strains" J. Bacteriol. 180: 979-984]. Thus, we
conclude that all three flagellin genes (obtained from
type strains for H4, H17 and H44, and sequenced) encode
the H4 flagellin, and that the flagellin genes for H17 and
H44 specificities have not yet been cloned.

10.5 Flagellin gene coding for H10:

The flagellin genes obtained from type strains for
H10 and H50 are nearly identical, with 3 a.a. differences
in the gene products. Strain P5560 carrying plasmid
pPR1923 (which carries a flagellin gene from the H10 type
strain) agglutinated with anti-H10 serum. We conclude that
the sequence obtained from the H10 type strain encodes the
H10 flagellin. It is not clear if the sequence obtained
from the H50 type strain encodes H10 or H50 (see the
section on H50 below).



10.6 Flagellin gene coding for H38:

The flagellin genes obtained from type strains for
H38 and H55 are nearly identical, with only 1 a.a.
difference in the gene products. Strain M2126 carrying
plasmid pPR1984 (carrying the flagellin gene from the H38
type strain) agglutinated with anti-H38 serum, but not with
anti-H55 serum. It has also been shown that the type
strain for H55 has two complete flagellin genes coding for
H55 and H38 specificities [Ratiner, Y. A. (1998) "New
flagellin specifying genes in some E. coli strains" J.
Bacteriol. 180: 979-984]. Thus, we conclude that both
cloned genes encode the H38 flagellin.

10.7 Summary:

Flagellin genes coding for 39 H antigens have been
identified, being those for specificities H1, H2, H4, H5,
H6, H7, H9, H10, H11, H12, H14, H15, H16, H18, H19, H20,
H21, H23, H24, H26, H27, H28, H29, H30, H31, H32, H33,
H34, H38, H39, H41, H42, H43, H45, H46, H49, H51, H52, and
H56.

11. Comparison and alignment of the flagellin genes: 

Programs Pileup [Devereux, J., Haeberli, P. and
Smithies, O.: A comprehensive set of sequence analysis
programs for the VAX. Nucl. Acids Res. 12 (1984) 387-395]
and Multicomp [Reeves, P.R., Farnell, L. and Lan, R.:
MULTICOMP: a program for preparing sequence data for
phylogenetic analysis. CABIOS 10 (1994) 281-284] were
used.

The previously published sequence of H1 (GenBank
accession number L07387) was extracted from GenBank and
used. Because we did not sequence the H36 and H53 flagellin
genes and we did not have the H16 type strain, we only
compared 51 flagellin genes of H type strains and the fliC
genes from the additional 10 H7 strains.
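
By way of illustration only, a percentage DNA difference of the kind reported below can be computed from two already-aligned sequences as in the following minimal Python sketch. The function name, the toy sequences and the choice to skip gapped positions are assumptions made for this example; they are not taken from Pileup or Multicomp and do not reproduce the figures reported in this section.

```python
def percent_difference(seq_a: str, seq_b: str) -> float:
    """Percentage of aligned, ungapped positions at which two sequences differ.

    Assumption: both sequences come from the same multiple alignment, so they
    have equal length, and positions with a gap ('-') in either are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # gapped column: not counted as compared
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * mismatches / compared


# Toy aligned sequences (hypothetical, not real flagellin genes).
gene_a = "ATGGCACAAGTCATTAATAC"
gene_b = "ATGGCACAAGTCATTAACAC"  # one substitution relative to gene_a
gene_c = "ATGGCTCAAGTC--TAATAC"  # one substitution and a two-base gap

print(percent_difference(gene_a, gene_b))
print(percent_difference(gene_a, gene_c))
```

Applying such a function to every pair of aligned genes gives a difference matrix from which near-identical pairs can be picked out.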

Among the H7 fliC genes, the percentage of DNA
difference ranged from 0.0 to 2.39%. The flagellin genes
from type strains for H40 and H8 are identical. Some
others are nearly identical: H21 and H47 (1.5%
difference), H12 and H1 (2.6% difference), H10 and H50
(0.3% difference), H38 and H55 (0.1% difference). H4, H44
and H17 are very similar, the pairwise difference ranging
from 0.33% to 0.87%.

For the flagellin genes obtained from type strains
for H4, H17 and H44, we have shown that all three
genes encode flagellin with the H4 specificity (see
above).

For the flagellin genes obtained from type
strains for H21 and H47, and H38 and H55, we have
confirmed the specificity for one of each pair and have
good reason to conclude that both genes of each pair
encode the same H specificity (see above section), being
that for H21 and H38 specificities respectively.

For the flagellin genes obtained from type strains
for H10 and H50, we have confirmed that the one from the
H10 type strain encodes H10 specificity. As these two
genes are highly similar, we have presumed that they both
encode H10 specificity.

In the cases where the flagellin genes from two type
strains are nearly identical, we conclude that both genes
code for flagellin of the same H specificity and that one
or other strain has an additional locus which carries the
functional gene, although the flagellin genes sequenced
do not appear to be mutated.

We have shown by cloning and expression that the
flagellin genes obtained from the H1 and H12 type strains
encode H1 and H12 specificities respectively (see above
section). The nucleotide difference between these two
genes is higher at 2.6% (see above), but still within the
normal range for variation within a gene in E. coli. The
two antigens cross-react, and this cross-reaction must be
due to the high level of similarity of the flagellins
encoded by these two genes.

As discussed above, genes encoding some H antigens
have been shown to be located at loci other than fliC. H3,
H36, H47 and H53 have been shown to be at a locus called
flkA, H44 and H55 at fllA, and H54 at flmA [Ratiner Y A
(1998) "New flagellin-specifying genes in some Escherichia
coli strains" J. Bacteriol. 180 979-984]. However, these
strains may carry a fliC in addition to flkA, fllA or flmA
[Ratiner Y A (1998) "New flagellin-specifying genes in some
Escherichia coli strains" J. Bacteriol. 180 979-984].

The flagellin gene encoding H48 was previously
sequenced from E. coli strain K-12 [Kuwajima G, Asaka J,
Fujiwara T, Node K and Kondo E "Nucleotide sequence of the
hag gene encoding flagellin of Escherichia coli" J
Bacteriol. 168 (1986) 1479-1483]. We have sequenced the
fliC gene from the H48 type strain, and found that it is
identical to that from K-12.

The H54 gene is known to be at flmA [Ratiner Y A
(1998) "New flagellin-specifying genes in some Escherichia
coli strains" J. Bacteriol. 180 979-984],
and the finding of a non-functional presumptive fliC locus
in the H54 strain shows that it is present but not
expressed. However, we have not amplified and sequenced
the functional flmA gene of this strain.

Using the 43 unique sequences (being the 39
identified genes with confirmed specificities and the
flagellin genes obtained from the H8 (or H40), H25, H37,
and H48 type strains) and the sequences from the two
non-functional flagellin genes (from H type strains H35 and
H54) (see Table 3) we have been able to determine antigen
specific primers for each of the H antigen specificities
and thereby show that it is practicable to detect E. coli
strains carrying specific H antigens without false
positives from strains of other H types. There is no
reason to expect that the addition of 11 sequences to the
43 unique sequences obtained will affect the general
conclusion, as unlike previous reports, our study covers
flagellin sequences for a substantial majority of known E.
coli H antigen specificities.

Our study of 11 H7 genes from strains of eight
different O antigens shows limited variation, such that
the variation within genes for H antigens does
not affect the ability to select antigen specific primers.
O:H combinations in general define a strain, and as some of
the strains thus defined were quite distant from each
other in a study by Whittam [Whittam T S, Wolfe M L,
Wachsmuth I K, Orskov I and Wilson R A "Clonal
relationships among Escherichia coli strains that cause
hemorrhagic colitis and infantile diarrhea" Infect. Immun.
61 (1993) 1619-1629] the variation we observe is thought
to represent that generally present in H7 genes. We also
obtained more than one sequence for flagellin genes for H
specificities H4, H10 and H38, and again the level of
variation within a given specificity is very low.
However, there is a low possibility that primers chosen
without knowledge of the variation within genes of each H
specificity could fail to give positive results with some
isolates due to chance choice of primers which cover a
base or bases which contribute to this low level
variation. The variation within the H7 genes is in the
normal range for variation within a gene in E. coli and if
this possibility did occur it would be easy to use an
alternate primer pair. For example, if a first primer in
a primer pair is unable to hybridise to a target region
because of low level variation in that region, a positive
result may be achieved by using a second primer in that
pair together with a third primer, whether or not the
third primer is specific for the flagellin gene. Where
the third primer is not specific for the flagellin gene,
the specificity of the primer pair derives from the
specificity of the second primer. The observation that
the overall level of variation within genes for a given H
specificity is very low makes it extremely unlikely that
the regions covered by the two primers specific for an H
specificity would both have undergone change in the same
strain.

There are 54 known H antigens for E. coli and of
these there are 11 H antigen specificities for which we do
not as yet have sequence. It will be easy to determine
these sequences and determine primer pairs specific for
these H antigens by comparing these sequences with the 45
obtained sequences (see Table 3), and also to modify the
primers selected for any H antigen for which we already
know the sequence in the unlikely event that there is a
possibility of false positives with the primers selected.
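
The counts used above (39, 43, 45, 11 and 15) can be cross-checked with a short bookkeeping sketch. It is offered only as an illustration. It assumes that the 54 recognised H antigens are numbered H1 to H56 with H13 and H22 not in use; that numbering is not stated in this specification and is labelled as an assumption in the code.

```python
# Assumption (not stated in the text): the 54 H antigens are H1-H56
# with the numbers 13 and 22 not in use.
ALL_H = set(range(1, 57)) - {13, 22}

# The 39 specificities listed in section 10.7.
CONFIRMED = {1, 2, 4, 5, 6, 7, 9, 10, 11, 12, 14, 15, 16, 18, 19, 20,
             21, 23, 24, 26, 27, 28, 29, 30, 31, 32, 33, 34, 38, 39,
             41, 42, 43, 45, 46, 49, 51, 52, 56}

# Sequenced but not yet confirmed by expression; the H8 and H40 type
# strains yielded one shared sequence, so they count once.
UNCONFIRMED_UNIQUE = [{8, 40}, {25}, {37}, {48}]

# Non-functional genes interrupted by insertion sequences.
NONFUNCTIONAL = {35, 54}

remaining = sorted(ALL_H - CONFIRMED)
unique_sequences = len(CONFIRMED) + len(UNCONFIRMED_UNIQUE)
obtained_sequences = unique_sequences + len(NONFUNCTIONAL)
still_to_sequence = len(ALL_H) - unique_sequences

print(len(ALL_H), len(CONFIRMED), unique_sequences,
      obtained_sequences, still_to_sequence)
print(["H%d" % n for n in remaining])
```

Under that assumption the specificities without a confirmed gene are the 15 addressed in section 16.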

The sequences for the remaining H antigens can be
obtained in one of the following ways:

1. Where we have two bands by PCR (H36 and H53 type
strains), we purify each and sequence, and also clone each
into a strain mutated in its fliC gene and determine the H
antigen expressed by use of specific sera. In this way a
specific sequence can be related to an H antigen
specificity. The other band, which represents an H antigen
gene for a different specificity, is expected to include a
mutant gene or a gene similar to one of those for a known
H specificity, but if not may represent a new specificity
for which primer pairs could be selected. It may be
difficult to obtain expression of flagellin genes when
cloned from E. coli due to cloning together with
regulatory sequences which prevent expression. This is
easily avoided by cloning the major segment of the gene
into a functioning fliC gene to replace the equivalent
segment of that gene, using standard site directed
mutagenesis to give suitable restriction sites within the
cloned gene and incorporating those restriction sites into
primers used to amplify the major segment of the gene to
be studied to facilitate the cloning. We have cloned and
sequenced the PCR bands from the H36 and the H55 type
strains using this method (see section 16).

2. Where two or more strains have the same flagellin
gene sequence, the genes are cloned as above and the H
antigen specificity represented by this sequence is
determined. This identifies the strain in which the
expected gene is expressed and also those strains for
which we have sequenced a gene which is not being
expressed. We then clone the gene for the antigen
expressed in these strains by making a bank of plasmid
clones using chromosomal DNA and selecting for a clone which
is expressing an H antigen different from the one
represented by the known sequence. This can be done by
taking advantage of the fact that the H antigen is on
flagellin, the protein of the bacterial flagellum used for
movement of the bacteria. In the presence of antibodies
specific to that flagellum the bacteria cannot swim. For
selection the clones are placed in a situation in which
motile cells can swim away from the others and be
collected. There are many versions of these techniques
and any could be used. One version is to place the
bacteria on a nutrient agar plate with reduced agar
content such that bacteria can swim away from the site of
inoculation. This is easily seen as growth on the plate
and a sample of the bacteria which are motile can be
recovered and cultivated. In this way bacteria carrying
cloned H antigen genes can be selected. If the medium in
the plate has antibody added to it, only bacteria which
express an H antigen different to that recognised by the
antiserum will be able to swim. Specifically, if the
antiserum used is specific for the H antigen expressed by
the gene for which we have sequence, only clones which
express a different H antigen, such as those expressing
the H antigen expressed by the H type strains used to make
the plasmid bank, will be selected. Once the clone is
obtained, the H antigen gene can be sequenced.

Our work has shown that there are at least 7 cases
where the H antigen type strains carry two H antigen genes
which appear to be complete and have the potential to
function. However, while E. coli does not (in general)
have a capacity to express more than one flagellin gene,
it is striking that there are several loci for flagellin
genes [Ratiner Y A (1998) "New flagellin-specifying genes
in some Escherichia coli strains" J. Bacteriol. 180
979-984]. Several of the pairs of H type strains with
identical or near identical sequence do not include any of
the H antigen types shown by Ratiner [Ratiner Y A
(1998) "New flagellin-specifying genes in some Escherichia
coli strains" J. Bacteriol. 180 979-984] to map other than
at fliC, although these predominate. This suggests that
there are additional cases where the expressed gene is not
the only flagellin gene present. However, the fact that
many of the cases where we obtained flagellin genes of
identical or near identical sequence, and/or two flagellin
genes from one strain, involve type strains found by
Ratiner [Ratiner Y A (1998) "New flagellin-specifying genes
in some Escherichia coli strains" J. Bacteriol. 180
979-984] to carry genes mapping away from fliC
indicates that the phenomenon is of limited
extent. Nonetheless it remains possible, even where only
one gene has been obtained by PCR, that it is one of a
pair of flagellin genes, the other not being amplified by
the primers used, and further that it is the one not
amplified which is expressing the H antigen of the strain.

It will therefore be necessary to clone, as described
above, each of the flagellin genes we have sequenced and
confirm that it expresses the expected antigen, to ensure
that the invention gives results corresponding to those of
the traditional serotyping scheme. In the event that it
does not, the gene for the type antigen can be cloned and
sequenced by the means described above.

The 11 H7 fliC sequences fell into three groups, one
comprising the genes from the O157:H7 and O55:H7 strains,
which were identical, as expected given the proposed
relationship between the clones. It has been shown that E.
coli O157:H7 and O55:H7 clones are closely related
[Whittam T S, Wolfe M L, Wachsmuth I K, Orskov I and
Wilson R A "Clonal relationships among Escherichia coli
strains that cause hemorrhagic colitis and infantile
diarrhea" Infect. Immun. 61 (1993) 1619-1629]; thus it was
expected that the H7 fliC genes from O157 and O55 would be
identical. Among the H7 fliC sequences, we can identify
primers specific to the H7 fliC gene for each of the three
H7 groups. Two of these primers in combination with an H7
specific primer gave two primer pairs specific for the H7
gene from the O157:H7 and O55:H7 clones.

13. Specific oligonucleotide primers for each of the 43
flagellin genes

Two oligonucleotide primers were chosen based on each
of the 43 sequences. None of them had more than 85%
identity with any other of 61 flagellin gene sequences.
Thus, these primers are specific for each H type. These
primers are listed in Table 3.
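
As an illustrative sketch only, and not the procedure actually used to select the primers of Table 3, the 85% identity criterion can be checked by sliding a candidate primer along every other flagellin gene sequence, on both strands, and recording the highest ungapped identity. The function names, the toy sequences and the treatment of exactly 85% as acceptable are assumptions made for this example.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def max_identity(primer: str, target: str) -> float:
    """Highest percentage identity between the primer and any
    same-length, ungapped window of the target strand given."""
    primer, target = primer.upper(), target.upper()
    n = len(primer)
    best = 0
    for i in range(len(target) - n + 1):
        window = target[i:i + n]
        matches = sum(1 for p, t in zip(primer, window) if p == t)
        best = max(best, matches)
    return 100.0 * best / n


def is_specific(primer: str, other_genes, threshold: float = 85.0) -> bool:
    """True if the primer has no more than `threshold` percent identity
    with either strand of every sequence in `other_genes`."""
    for gene in other_genes:
        forward = max_identity(primer, gene)
        reverse = max_identity(primer, reverse_complement(gene))
        if max(forward, reverse) > threshold:
            return False
    return True


# Toy data (hypothetical sequences, not taken from Table 3).
primer = "ACGTTGCA"
own_gene = "GGGACGTTGCAGGG"    # contains the primer exactly
close_gene = "GGGACGTTGCTGGG"  # best window matches 7 of 8 bases
far_gene = "TTTTTTTTTTTTTT"    # best window matches 2 of 8 bases

print(max_identity(primer, own_gene))
print(max_identity(primer, close_gene))
print(is_specific(primer, [far_gene]), is_specific(primer, [close_gene]))
```

A real check would run each candidate primer against all of the other available flagellin gene sequences rather than the toy strings shown here.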

The flagellin gene of the H54 type strain is a
mutated gene. It has an insertion sequence (IS1222)
inserted into a normal flagellin gene of H21. Thus,
primers for H21 would amplify a fragment of different size
in H54. We also provide 2 primers based on the insertion
sequence (see H54 row in Table 3), and the use of one of
them in combination with one of the H21 primers will
generate a PCR band only in H54, which will also
differentiate those strains carrying the mutated H21 gene
from those expressing the H21 flagellin gene.

The fliC gene of the H35 type strain is also a mutated
gene. It has an insertion sequence (IS1) inserted into a
normal flagellin gene of H11. Thus, primers for H11 would
amplify a fragment of different size in H35. We also
provide 2 primers based on the insertion sequence (see H35
row in Table 3), and the use of one of them in combination
with one of the H11 primers will generate a PCR band only
in H35, which will also differentiate those strains
carrying the mutated H11 gene from those expressing the
H11 flagellin gene.
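
The reasoning in the two preceding paragraphs can be sketched in silico. The example below is a toy model under stated assumptions: primers anneal only at exact matches, the sequences are invented, and the 25-base "insertion" merely stands in for an insertion sequence such as IS1 or IS1222. It shows that gene-specific primers give a longer product from the interrupted gene, and that a primer within the insertion paired with a gene primer gives a product only from the interrupted gene.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def amplicon_length(template: str, forward: str, reverse: str):
    """Length of the product expected from exact-match primers, or None
    if either primer finds no site in the required orientation."""
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    site = reverse_complement(reverse)  # reverse primer site on the top strand
    end = template.find(site, start + len(forward))
    if end == -1:
        return None
    return end + len(site) - start


# Toy sequences (hypothetical; not real flagellin or IS sequences).
LEFT = "ATGGCACAAGTCATTAATAC"
RIGHT = "CCAACAGCCTCTGCAGGGTT"
INSERTION = "GGTAATGACTCCAACTTATTGATAG"

intact_gene = LEFT + RIGHT
mutated_gene = LEFT + INSERTION + RIGHT

gene_forward = LEFT[:10]                       # gene-specific forward primer
gene_reverse = reverse_complement(RIGHT[-10:])  # gene-specific reverse primer
insertion_forward = INSERTION[:10]             # primer within the insertion

print(amplicon_length(intact_gene, gene_forward, gene_reverse))
print(amplicon_length(mutated_gene, gene_forward, gene_reverse))
print(amplicon_length(intact_gene, insertion_forward, gene_reverse))
print(amplicon_length(mutated_gene, insertion_forward, gene_reverse))
```

In practice annealing tolerates some mismatches, so an exact-match search is only a first approximation of which bands would be seen on a gel.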

14. Testing of the H7 specific oligonucleotide primers 

Primer pair #1806/#1809 (see Table 3) was used to
carry out PCR on chromosomal DNA samples of all the 54 H
type strains and the H7 strains listed in Table 1. PCR
reactions were carried out under the following conditions:
denaturing, 94°C/30'; annealing, 58°C/30'; extension,
72°C/1'; 30 cycles. The PCR reaction was carried out in a
volume of 50 µl for each of the chromosomal samples. After
the PCR reaction, 5 µl of PCR product from each sample was run
on an agarose gel to check for amplified DNA.

Primer pair #1806/#1809 produced a band of predicted
size with all the 11 strains expressing H7, but gave no
band with other H type strains. Thus, these primers are H7
specific.



15. Testing of oligonucleotide primers specific to H7 of
O157 and O55:

Based on a comparison of the fliC sequences of 11
different H7 strains, we have identified two
oligonucleotides [#1696 (5'-GGCCTGACTCAGGCGGCC) at
positions 178 to 195 in M527 and #1697
(5'-GAGTTACCGGCCTGCTGA) at positions 1700 to 1683 in M527]
which are unique to H7 of O157 and O55. Although not identical to
any parts of the fliC sequences of any other H7 strains,
these two primers are identical or have high level
similarity to fliC genes of some other H types. However, a
combination of one of these primers with one of the H7
specific primers can give specificity for H7 of O157:H7
and O55:H7 E. coli.
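
Two points in the paragraph above can be made concrete with a short sketch, offered only as an illustration. First, primer #1697 is given at positions 1700 to 1683, that is, on the reverse strand, so its site on the coding strand is its reverse complement. Second, a primer pair yields a product only in strains where both primers anneal, so the specificity of the pair is the intersection of the two primers' ranges. The strain labels in the annealing sets are hypothetical placeholders, not data from Table 1.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


PRIMER_1696 = "GGCCTGACTCAGGCGGCC"  # positions 178 to 195 in M527
PRIMER_1697 = "GAGTTACCGGCCTGCTGA"  # positions 1700 to 1683 in M527

# Site of #1697 as read on the coding strand (positions 1683 to 1700).
site_1697 = reverse_complement(PRIMER_1697)

# Hypothetical annealing ranges, for illustration only.
anneals = {
    "clone_primer": {"O157:H7", "O55:H7", "some other H type"},
    "h7_primer": {"O157:H7", "O55:H7", "some other H7 strain"},
}


def amplified(first: str, second: str) -> set:
    """Strains in which both primers of the pair anneal."""
    return anneals[first] & anneals[second]


print(site_1697)
print(sorted(amplified("clone_primer", "h7_primer")))
```

The same intersection argument explains why neither primer needs to be unique on its own for the pair to be specific.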

Primer pairs #1696/#1809 and #1697/#1806 were used to
carry out PCR on chromosomal DNA samples of all the H
type strains and the H7 strains listed in Table 1. PCR
reactions were carried out under the following conditions:
denaturing, 94°C/30'; annealing, 61°C/30' (for
#1696/#1809) or 60°C/30' (for #1697/#1806); extension,
72°C/1'; 30 cycles. The PCR reaction was carried out in a
volume of 50 µl for each of the chromosomal samples. After
the PCR reaction, 5 µl of PCR product from each sample was run
on an agarose gel to check for amplified DNA.

Both primer pairs produced a band of predicted size
with both of the O157:H7 strains (strains M1004 and M527,
see Table 1), and the O55:H7 strain (strain M1686, see
Table 1), but gave no band with other strains. Thus, these
two pairs of primers are specific to H7 genes of O157:H7
and O55:H7 E. coli strains.

16. Identification of flagellin genes for the remaining 15
H specificities

16.1. Sequencing the potential flkA gene coding for the 
H36 flagellin: 

Using primers #1431 (5'- atg gca caa gtc att aat acc
caa c) and #1432 (5'- cta acc ctg cag cag aga ca), we have
amplified two bands from the H36 type strain. The PCR reaction
was carried out under the following conditions:
denaturing, 94°C/30'; annealing, 57°C/30'; extension,
72°C/1'; 30 cycles. These two PCR fragments were then
cloned into the pGEM-T vector using the Promega pGEM-T
cloning kit (Madison WI USA) to make plasmids pPR1992 and
pPR1993. Inserts from both plasmids were first sequenced
using the M13 universal primers (which bind to the pGEM-T
DNA flanking the insertion site). For pPR1992, primers
based on the sequence obtained were then used to sequence
further, and this procedure was repeated until the insert
was fully sequenced.

The sequence of the insert of pPR1992 is identical to
that of the H12 flagellin gene sequence except perhaps for
the first 8 and last 7 codons, which are encoded by the PCR
primers in plasmid pPR1992. We have only sequenced the
two ends of the insert of plasmid pPR1993 (Figures 71 and
72), and the sequences of the two ends of the insert of
pPR1993 are very similar to the ends of other sequenced
flagellin genes. We conclude that the insert of plasmid
pPR1993 encodes a flagellin gene. The full sequence of
the insert of plasmid pPR1993 can be obtained using the
same method as for the sequencing of the insert of plasmid
pPR1992. It is known that the flkA gene encodes the H36
flagellin [Ratiner, Y. A. (1998) "New flagellin specifying
genes in some E. coli strains" J. Bacteriol. 180: 979-984],
and it is highly likely that plasmid pPR1993 contains the
flkA gene of the H36 type strain. H specificities can be
confirmed by slide agglutination.

The currently uncharacterised sequence of both ends
and of DNA flanking these two sequenced genes can be
obtained by PCR walking and sequencing. Methods for PCR
walking from a known sequence to an unknown region in
chromosomal DNA are available (see [Siebert, P. D., A.
Chenchik, D. E. Kellogg, A. Lukyanov and S. A. Lukyanov
(1995) "An improved PCR method for walking in uncloned
genomic DNA." Nuc. Acids Res. 23: 1087-1088]).

The sequenced genes then can be PCR amplified and
cloned using the method(s) described in section 9.
Flagellins expressed by strain M2126 carrying these
plasmids then can be determined by use of specific sera.

The sequences flanking the flkA gene can then be used
to PCR amplify other flkA genes (see below).

16.2 The flkA genes coding for H3, H47 and H53:

It has been shown that flagellins H3, H47 and H53 are
encoded by flkA genes in the type strains [Ratiner, Y. A.
(1998) "New flagellin specifying genes in some E. coli
strains" J. Bacteriol. 180: 979-984]. These genes can be
PCR amplified using primers based on the sequences
flanking the flkA gene in the H36 type strain. These PCR
fragments can then be sequenced, and the genes expressed
in strain M2126 for the identification of these genes.

16.3 The fllA genes coding for H44 and H55: 

It is known that flagellins H44 and H55 are coded by
fllA genes.

16.3.1 The H55 flagellin gene: 

Using primers #1868 and #1870 (Table 3B), we have
amplified two bands from the H55 type strain. The PCR reaction
was carried out under the following conditions:
denaturing, 94°C/30'; annealing, 50°C/30'; extension,
72°C/1'; 30 cycles. These two PCR fragments were then
cloned into the pGEM-T vector using the Promega pGEM-T
cloning kit (Madison WI USA) to make plasmids pPR1994 and
pPR1989. Inserts from both plasmids were first sequenced
using the M13 universal primers (which bind to the pGEM-T
DNA flanking the insertion site). Primers based on the
sequence obtained were then used to sequence further, and
this procedure was repeated until both inserts were fully
or partly sequenced.

The sequence of the insert of pPR1994 is highly
similar to that of the flagellin gene of the H38 type
strain, with 1 amino acid difference in the gene products.
We have only sequenced the two ends of the insert of
plasmid pPR1989 (Figures 70A and 70B), and the sequences
of the two ends of the insert of pPR1989 are very similar
to the ends of other sequenced flagellin genes. We conclude
that the insert of plasmid pPR1989 encodes a flagellin
gene. The full sequence of the insert of plasmid pPR1989
can be obtained using the same method as for the
sequencing of the insert of plasmid pPR1994. It is known
that the H55 type strain carries flagellin genes for both
H38 and H55, and that the H55 flagellin gene is at the
fllA locus [Ratiner, Y. A. (1998) "New flagellin
specifying genes in some E. coli strains" J. Bacteriol.
180: 979-984]. Thus, it is highly likely that plasmid
pPR1989 contains the fllA gene of the H55 type strain.

The currently uncharacterised sequence of both ends
and of DNA flanking these two sequenced genes can be
obtained by PCR walking and sequencing. Methods for PCR
walking from a known sequence to an unknown region in
chromosomal DNA are available (see [Siebert, P. D., A.
Chenchik, D. E. Kellogg, A. Lukyanov and S. A. Lukyanov
(1995) "An improved PCR method for walking in uncloned
genomic DNA." Nuc. Acids Res. 23: 1087-1088]).

The sequenced genes then can be PCR amplified and
cloned using the method(s) described in section 9.
Flagellins expressed by strain M2126 carrying these
plasmids then can be determined by use of specific sera.

16.3.2 The H44 flagellin gene:

The sequence information for DNA flanking the fllA
gene in the H55 type strain can then be used to PCR amplify,
sequence and identify the fllA gene in the H44 type
strain.

16.4 The flmA gene coding for H54:

This gene can be cloned by making a bank of plasmid
clones in strain M2126 using chromosomal DNA of the H54
type strain and selecting for a transformant which is
motile on an agar plate. This is done by taking advantage
of the fact that the H antigen is on flagellin, the
protein of the bacterial flagellum used for movement of
the bacteria. Strain M2126 lacks flagellin. Once the
clone(s) is obtained and identified by use of anti-H54
serum, the flagellin gene can be sequenced. It is possible
that clones expressing different flagellin specificities
can be obtained, and each of them can be identified by
using different sera.



16.5 The flagellin genes obtained from the H37
and H48 type strains:

We have used primers #1868 and #1869 (both were based
on the sequence obtained from the H48 type strain, also
see section 9) and primers #1868 and #1870 (both were
based on the sequences of the H7 flagellin gene of the H7
type strain, also see section 9) to PCR amplify and clone
the sequenced flagellin genes from the H48 and H37 type
strains respectively. Strain P5560 carrying the plasmid
containing either cloned gene was not motile and did
not react with the appropriate antisera. It is highly
likely that mutations have occurred due to PCR errors. This
can be resolved by re-amplification and re-cloning of the
genes.

16.6 The flagellin gene obtained from the H25 type
strain:

The flagellin gene sequence we first obtained from
the H25 type strain lacks 23 and 21 codons at the 5' and 3'
ends respectively. We could not amplify the full gene from
the H25 type strain using primers based on the H7
flagellin gene of the H7 type strain, and it was necessary
to get the full sequence of this flagellin gene by other
means.

We have used primers (#2650: 5' - cag cga tga aat act
tgc cat and #2648: 5' - caa tgc ttc gtg acg cac) based on
the genes (fliD and fliA respectively) flanking the fliC gene
in E. coli K-12 [Blattner, F. R., G. I. Plunkett, C. A.
Bloch, N. T. Perna, V. Burland, M. Riley et al. (1997)
"The complete genome sequence of E. coli K-12" Science
277: 1453-1474] and primers (#2658: 5' - gcc tga gtc aga
act ttg and #2653: 5' - aac ctg tct gaa gcg cag) based on
the flagellin sequence obtained from the H25 type strain
to PCR amplify both ends of the flagellin gene. The PCR
product was then sequenced, and we have now obtained the
full flagellin gene sequence and sequence for the DNA
flanking the flagellin gene from type strain H25 (Figure
69). It is now straightforward to PCR amplify, clone,
express and identify this gene using the methods
described in sections 9 and 10.

16.7 The flagellin genes obtained from the H8
and H40 type strains:

The flagellin gene sequences obtained from both the
H8 and H40 type strains lack 18 and 15 codons at the 5' and 3'
ends respectively. We have used primers based on the H7
flagellin gene of the H7 type strain to PCR amplify and
clone the full genes from these two strains. Strain M2126
carrying a plasmid made this way was not motile under the
microscope and did not react with the appropriate
antisera. This could be due to PCR errors as mentioned in
section 16.5, or perhaps the first and last few amino acids
encoded by the primers (based on the H7 flagellin gene) are
incompatible in this case.

The full sequence of the full gene can be obtained
using the method described in section 16.6. The flagellin gene
can then be PCR amplified, cloned and expressed, and
identified using the methods described in sections 9 and
10.

The gene products of the flagellin genes obtained
from the H8 and H40 type strains are identical. Thus, one
of these two H specificities must be encoded by an unknown
gene, and it can be cloned and identified using the method
described in section 16.8.

16.8 Flagellin genes coding for H17, H35, and H50:

As mentioned above, the sequenced flagellin genes
from the H17 and H50 type strains encode H4 and H10
specificities respectively. The flagellin gene sequence
obtained from the H35 strain has an insertion and encodes a
non-functional gene (see section 8). Thus, genes coding
for these flagellins have not been identified, and their
location is unknown. One can use primers based on DNA
flanking fliC, fllA, flkA, and flmA to do PCR on the type
strain for each of these flagellin antigens. PCR products can
then be sequenced, and possible genes can then be cloned,
expressed and identified.

If the target gene is not PCR amplified using primers
based on sequence of these loci or sequence flanking these
loci, it can be cloned by making a bank of plasmid clones
in strain M2126 using chromosomal DNA of the type strain
and selecting for a transformant which is motile on an
agar plate. This is done by taking advantage of the fact
that the H antigen is on flagellin, the protein of the
bacterial flagellum used for movement of the bacteria.
Strain M2126 lacks flagellin. Once the clone(s) is
obtained and identified by use of antisera, the flagellin
gene can be sequenced. It is possible that clones
expressing different flagellin antigens can be obtained,
and each of them can be identified by using different
antisera. Antiserum for H50 can be prepared using
standard methods [Ewing, W.H.: Edwards and Ewing's
identification of the Enterobacteriaceae, Elsevier
Science Publishers, Amsterdam, The Netherlands, 1986].

O antigen 

Materials and Methods-part 1 

The experimental procedures for the isolation and 
characterisation of the E. coli O111 O antigen gene 
cluster (position 3,021-9,981) are according to Bastin 
D.A., et al. 1991 "Molecular cloning and expression in 
Escherichia coli K-12 of the rfb gene cluster determining 
the O antigen of an E. coli O111 strain". Mol. Microbiol. 
5(9): 2223-2231 and Bastin D.A. and Reeves, P.R. 1995 
"Sequence and analysis of the O antigen gene (rfb) cluster 
of Escherichia coli O111". Gene 164: 17-23. 

A. Bacterial strains and growth media 

Bacteria were grown in Luria broth supplemented as 
required. 

B. Cosmids and phage 

Cosmids in the host strain x2819 were repackaged in 
vivo. Cells were grown in 250mL flasks containing 30mL of 
culture, with moderate shaking at 30°C to an optical 
density of 0.3 at 580 nm. The defective lambda prophage 
was induced by heating in a water bath at 45°C for 15min 
followed by an incubation at 37°C with vigorous shaking 
for 2hr. Cells were then lysed by the addition of 0.3mL 
chloroform and shaking for a further 10min. Cell debris 
was removed from 1mL of lysate by a 5min spin in a 
microcentrifuge, and the supernatant removed to a fresh 
microfuge tube. One drop of chloroform was added and 
shaken vigorously through the tube contents. 

C. DNA preparation 

Chromosomal DNA was prepared from bacteria grown 
overnight at 37°C in a volume of 30mL of Luria broth. 
After harvesting by centrifugation, cells were washed and 
resuspended in 10mL of 50mM Tris-HCl pH 8.0. EDTA was 
added and the mixture incubated for 20min. Then lysozyme 
was added and incubation continued for a further 10min. 
Proteinase K, SDS, and ribonuclease were then added and 
the mixture incubated for up to 2hr for lysis to occur. 
All incubations were at 37°C. The mixture was then 
heated to 65°C and extracted once with 8mL of phenol at 
the same temperature. The mixture was extracted once 
with 5mL of phenol/chloroform/iso-amyl alcohol at 4°C. 
Residual phenol was removed by two ether extractions. 
DNA was precipitated with 2 vols. of ethanol at 4°C, 
spooled and washed in 70% ethanol, resuspended in 1-2mL 
of TE and dialysed. Plasmid and cosmid DNA was prepared 
by a modification of the Birnboim and Doly method 
[Birnboim, H. C. and Doly, J. (1979) "A rapid alkaline 
extraction procedure for screening recombinant plasmid 
DNA" Nucl. Acid Res. 7:1513-1523]. The volume of culture 
was 10mL and the lysate was extracted with 
phenol/chloroform/iso-amyl alcohol before precipitation 
with isopropanol. Plasmid DNA to be used as vector was 
isolated on a continuous caesium chloride gradient 
following alkaline lysis of cells grown in 1L of culture. 

D. Enzymes and buffers. 

Restriction endonucleases and T4 DNA ligase were 
purchased from Boehringer Mannheim (Castle Hill, NSW, 
Australia) or Pharmacia LKB (Melbourne, VIC Australia). 
Restriction enzymes were used in the recommended 
commercial buffer. 

E. Construction of a gene bank. 

Individual aliquots of M92 chromosomal DNA (strain 
Stoke W, from Statens Serum Institut, 5 Artillerivej, 
2300 Copenhagen S, Denmark) were partially digested with 
0.2U Sau3AI for 1-15min. Aliquots giving the greatest 
proportion of fragments in the size range of 
approximately 40-50kb were selected and ligated to vector 
pPR691 previously digested with BamHI and PvuII. Ligation 
mixtures were packaged in vitro with packaging extract. 
The host strain for transduction was x2819 and 
recombinants were selected with kanamycin. 

F. Serological procedures. 

Colonies were screened for the presence of the O111 
antigen by immunoblotting. Colonies were grown 
overnight, up to 100 per plate, then transferred to 
nitrocellulose discs and lysed with 0.5N HCl. Tween 20 
was added to TBS at 0.05% final concentration for 
blocking, incubating and washing steps. Primary antibody 
was E. coli O group 111 antiserum, diluted 1:800. The 
secondary antibody was goat anti-rabbit IgG labelled with 
horseradish peroxidase diluted 1:5000. The staining 
substrate was 4-chloro-1-naphthol. Slide agglutination 
was performed according to the standard procedure. 

G. Recombinant DNA methods. 

Restriction mapping was based on a combination of 
standard methods including single and double digests and 
sub-cloning. Deletion derivatives of entire cosmids were 
produced as follows: aliquots of 1.8µg of cosmid DNA were 
digested in a volume of 20µl with 0.25U of restriction 
enzyme for 5-80min. One half of each aliquot was used to 
check the degree of digestion on an agarose gel. The 
sample which appeared to give a representative range of 
fragments was ligated at 4°C overnight and transformed by 
the CaCl2 method into JM109. Selected plasmids were 
transformed into Sφ174 by the same method. P4657 was 
transformed with pPR1244 by electroporation. 

H. DNA hybridisation 

Probe DNA was extracted from agarose gels by 
electroelution and was nick-translated using [α-32P]-dCTP. 
Chromosomal or plasmid DNA was electrophoresed in 
0.8% agarose and transferred to a nitrocellulose 
membrane. The hybridisation and pre-hybridisation 
buffers contained either 30% or 50% formamide for low 
and high stringency probing respectively. Incubation 
temperatures were 42°C and 37°C for pre-hybridisation and 
hybridisation respectively. Low stringency washing of 
filters consisted of 3 x 20min washes in 2 x SSC and 0.1% 
SDS. High-stringency washing consisted of 3 x 5min 
washes in 2 x SSC and 0.1% SDS at room temperature, a 1hr 
wash in 1 x SSC and 0.1% SDS at 58°C and a 15min wash in 
0.1 x SSC and 0.1% SDS at 58°C. 

I. Nucleotide sequencing of E. coli O111 O antigen gene 
cluster (position 3,021-9,981) 

Nucleotide sequencing was performed using an ABI 373 
automated sequencer (CA, USA). The region between map 
positions 3.30 and 7.90 was sequenced using 
uni-directional exonuclease III digestion of deletion 
families made in pT7T3190 from clones pPR1270 and 
pPR1272. Gaps were filled largely by cloning of selected 
fragments into M13mp18 or M13mp19. The region from map 
positions 7.90-10.2 was sequenced from restriction 
fragments in M13mp18 or M13mp19. Remaining gaps in both 
regions were filled by priming from synthetic 
oligonucleotides complementary to determined positions 
along the sequence, using a single stranded DNA template 
in M13 or phagemid. The oligonucleotides were designed 
after analysing the adjacent sequence. All sequencing 
was performed by the chain termination method. Sequences 
were aligned using SAP [Staden, R., 1982 "Automation of 
the computer handling of gel reading data produced by the 
shotgun method of DNA sequencing". Nuc. Acid Res. 10: 
4731-4751; Staden, R., 1986 "The current status and 
portability of our sequence handling software". Nuc. Acid 
Res. 14: 217-231]. The program NIP [Staden, R. 1982 "An 
interactive graphics program for comparing and aligning 
nucleic acid and amino acid sequence". Nuc. Acid Res. 
10: 2951-2961] was used to find open reading frames and 
translate them into proteins. 
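
The open reading frame search and translation step can be illustrated with a minimal sketch. This is not the NIP program: the function names, the forward-strand-only scan and the minimum-length parameter are assumptions made for illustration, and the standard genetic code is used.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
STOPS = {"TAA", "TAG", "TGA"}


def translate(dna):
    """Translate a DNA string codon by codon ('*' marks a stop codon)."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def find_orfs(dna, min_codons=2):
    """Return (start, end) pairs for forward-strand ORFs.

    Coordinates are 0-based, end-exclusive, and include the stop codon.
    Only ATG is treated as a start codon (an illustrative simplification).
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOPS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return sorted(orfs)
```

For a real gene cluster the reverse strand and alternative start codons would also need to be considered; the sketch omits both.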

J. Isolation of clones carrying E. coli O111 O antigen 
gene cluster 

The E. coli O antigen gene cluster was isolated 
according to the method of Bastin D.A., et al. [1991 
"Molecular cloning and expression in Escherichia coli 
K-12 of the rfb gene cluster determining the O antigen of 
an E. coli O111 strain". Mol. Microbiol. 5(9), 
2223-2231]. Cosmid gene banks of M92 chromosomal DNA were 
established in the in vivo packaging strain x2819. From 
the genomic bank, 3.3 x 10^ colonies were screened with 
E. coli O111 antiserum using an immunoblotting procedure: 
5 colonies (pPR1054, pPR1055, pPR1056, pPR1058 and 
pPR1287) were positive. The cosmids from these strains 
were packaged in vivo into lambda particles and 
transduced into the E. coli deletion mutant Sφ174 which 
lacks all O antigen genes. In this host strain, all 
plasmids gave positive agglutination with O111 antiserum. 

An EcoRI restriction map of the 5 independent cosmids 
showed that they have a region of approximately 11.5 kb 
in common (Figure 1). Cosmid pPR1058 included sufficient 
flanking DNA to identify several chromosomal markers 
linked to the O antigen gene cluster and was selected for 
analysis of the O antigen gene cluster region. 

K. Restriction mapping of cosmid pPR1058 

Cosmid pPR1058 was mapped in two stages. A 
preliminary map was constructed first, and then the 
region between map positions 0.00 and 23.10 was mapped in 
detail, since it was shown to be sufficient for O111 
antigen expression. Restriction sites for both stages 
are shown in Figure 2. The region common to the five 
cosmid clones was between map positions 1.35 and 12.95 of 
pPR1058. 

To locate the O antigen gene cluster within pPR1058, 
the cosmid was probed with DNA probes covering O 
antigen gene cluster flanking regions from S. enterica 
LT2 and E. coli K-12. Capsular polysaccharide (cps) genes 
lie upstream of the O antigen gene cluster while the 
gluconate dehydrogenase (gnd) gene and the histidine 
(his) operon are downstream, the latter being further 
from the O antigen gene cluster. The probes used were 
pPR472 (3.35kb), carrying the gnd gene of LT2, pPR685 
(5.3kb) carrying two genes of the cps cluster, cpsB and 
cpsG of LT2, and K350 (16.5kb) carrying all of the his 
operon of K-12. Probes hybridised as follows: pPR472 
hybridised to 1.55kb and 3.5kb (including 2.7kb of 
vector) fragments of PstI and HindIII double digests of 
pPR1246 (a HindIII/EcoRI subclone derived from pPR1058, 
Figure 2), which could be located at map positions 
12.95-15.1; pPR685 hybridised to a 4.4kb EcoRI fragment of 
pPR1058 (including 1.3kb of vector) located at map 
position 0.00-3.05; and K350 hybridised with a 32kb EcoRI 
fragment of pPR1058 (including 4.0kb of vector), located 
at map position 17.30-45.90. Subclones containing the 
presumed gnd region complemented a gnd- edd- strain 
GB23152. On gluconate bromothymol blue plates, pPR1244 
and pPR1292 in this host strain gave the green colonies 
expected of a gnd+ edd- genotype. The his+ phenotype was 
restored by plasmid pPR1058 in the his deletion strain 
Sφ174 on minimal medium plates, showing that the plasmid 
carries the entire his operon. 

It is likely that the O antigen gene cluster region 
lies between gnd and cps, as in other E. coli and S. 
enterica strains, and hence between the approximate map 
positions 3.05 and 12.95. To confirm this, deletion 
derivatives of pPR1058 were made as follows: first, 
pPR1058 was partially digested with HindIII and self 
ligated. Transformants were selected for kanamycin 
resistance and screened for expression of O111 antigen. 
Two colonies gave a positive reaction. EcoRI digestion 
showed that the two colonies hosted identical plasmids, 
one of which was designated pPR1230, with an insert which 
extended from map positions 0.00 to 23.10. Second, 
pPR1058 was digested with SalI and partially digested 
with XhoI and the compatible ends were re-ligated. 
Transformants were selected with kanamycin and screened 
for O111 antigen expression. Plasmid DNA of 8 
positively reacting clones was checked using EcoRI and 
XhoI digestion and appeared to be identical. The cosmid 
of one was designated pPR1231. The insert of pPR1231 
contained the DNA region between map positions 0.00 and 
15.10. Third, pPR1231 was partially digested with XhoI, 
self-ligated, and transformants selected on 
spectinomycin/streptomycin plates. Clones were screened 
for kanamycin sensitivity and of 10 selected, all had the 
DNA region from the XhoI site in the vector to the XhoI 
site at position 4.00 deleted. These clones did not 
express the O111 antigen, showing that the XhoI site at 
position 4.00 is within the O antigen gene cluster. One 
clone was selected and named pPR1288. Plasmids pPR1230, 
pPR1231, and pPR1288 are shown in Figure 2. 
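
The deletion-mapping logic above can be checked with a short sketch. The map positions are taken from the text; the remaining insert assigned to pPR1288 (4.00 to 15.10) is an assumption inferred from the description of the XhoI deletion, and the names `CLUSTER`, `INSERTS` and `covers` are hypothetical.

```python
# Presumed O antigen gene cluster interval (between cps and gnd), map units as in the text.
CLUSTER = (3.05, 12.95)

# Insert extents of the deletion derivatives of pPR1058.
INSERTS = {
    "pPR1230": (0.00, 23.10),
    "pPR1231": (0.00, 15.10),
    # Assumption: pPR1231 minus the region from the vector XhoI site to position 4.00.
    "pPR1288": (4.00, 15.10),
}

# O111 antigen expression as reported in the text.
EXPRESSES_O111 = {"pPR1230": True, "pPR1231": True, "pPR1288": False}


def covers(insert, region):
    """True if the insert interval spans the whole region."""
    return insert[0] <= region[0] and insert[1] >= region[1]


def consistent_with_cluster(region=CLUSTER):
    """Check that expression is observed exactly when the insert spans the region."""
    return all(covers(INSERTS[p], region) == EXPRESSES_O111[p] for p in INSERTS)
```

Under these assumptions, every construct that spans positions 3.05 to 12.95 expresses the antigen and the one that does not span it fails to, which is what the text concludes.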

L. Analysis of the E. coli O111 O antigen gene 
cluster (position 3,021-9,981) nucleotide sequence data 

Bastin and Reeves [1995 "Sequence and analysis of 
the O antigen gene (rfb) cluster of Escherichia coli O111". 
Gene 164: 17-23] partially characterised the E. coli O111 
O antigen gene cluster by sequencing a fragment from map 
position 3,021-9,981. Figure 3 shows the gene 
organisation of position 3,021-9,981 of the E. coli O111 O 
antigen gene cluster. orf3 and orf6 have high level 
amino acid identity with wcaH and wcaG (46.3% and 37.2% 
respectively), and are likely to be similar in function 
to sugar biosynthetic pathway genes in the E. coli K-12 
colanic acid gene cluster. orf4 and orf5 show high levels of 
amino acid homology to manC and manB genes respectively. 

orf7 shows high level homology with rfbH which is an 
abequose pathway gene. orf8 encodes a protein with 12 
transmembrane segments and has similarity in secondary 
structure to other wzx genes and is likely therefore to 
be the O antigen flippase gene. 

Materials and Methods-part 2 

A. Nucleotide sequencing of positions 1 to 3,020 and 9,982 to 
14,516 of the E. coli O111 O antigen gene cluster 

The subclones which contained novel nucleotide 
sequences, pPR1231 (map position 0 to 1,510), pPR1237 
(map position -300 to 2,744), pPR1239 (map position 2,744 
to 4,168), pPR1245 (map position 9,736 to 12,007) and 
pPR1246 (map position 12,007 to 15,300) (Figure 2), were 
characterised as follows: the distal ends of the inserts 
of pPR1237, pPR1239 and pPR1245 were sequenced using the 
M13 forward and reverse primers located in the vector. 
PCR walking was carried out to sequence further into each 
insert using primers based on the sequence data and the 
primers were tagged with M13 forward or reverse primer 
sequences for sequencing. This PCR walking procedure was 
repeated until the entire insert was sequenced. pPR1246 
was characterised from position 12,007 to 14,516. The 
DNA of these subclones was sequenced in both directions. 

The sequencing reactions were performed using the 
dideoxy termination method and thermocycling and reaction 
products were analysed using fluorescent dye and an ABI 
automated sequencer (CA, USA). 

B. Analysis of the E. coli O111 O antigen gene cluster 
(positions 1 to 3,020 and 9,982 to 14,516 of Figure 5) 
nucleotide sequence data 

The gene organisation of regions of the E. coli O111 O 
antigen gene cluster which were not characterised by 
Bastin and Reeves [1995 "Sequence and analysis of the O 
antigen gene (rfb) cluster of Escherichia coli O111." Gene 
164: 17-23] (positions 1 to 3,020 and 9,982 to 14,516) is 
shown in Figure 3. There are two open reading frames in 
region 1. Four open reading frames are predicted in 
region 2. The position of each gene is listed in Table 
9. 

The deduced amino acid sequence of orf1 (wbdH) 
shares about 64% similarity with that of the rfp gene of 
Shigella dysenteriae. Rfp and WbdH have very similar 
hydrophobicity plots and both have a very convincing 
predicted transmembrane segment in a corresponding 
position. rfp is a galactosyl transferase involved in 
the synthesis of LPS core, thus wbdH is likely to be a 
galactosyl transferase gene. orf2 has 85.7% identity at 
amino acid level to the gmd gene identified in the E. 
coli K-12 colanic acid gene cluster and is likely to be a 
gmd gene. orf9 encodes a protein with 10 predicted 
transmembrane segments and a large cytoplasmic loop. 
This inner membrane topology is a characteristic feature 
of all known O antigen polymerases, thus it is likely that 
orf9 is an O antigen polymerase gene, wzy. orf10 
(wbdL) has a deduced amino acid sequence with low 
homology with Lsi2 of Neisseria gonorrhoeae. Lsi2 is 
responsible for adding GlcNAc to galactose in the 
synthesis of lipooligosaccharide. Thus it is likely that 
wbdL is either a colitose or glucose transferase gene. 
orf11 (wbdM) shares high level nucleotide and amino acid 
similarity with TrsE of Yersinia enterocolitica. TrsE is 
a putative sugar transferase, thus it is likely that wbdM 
encodes the colitose or glucose transferase. 
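
The hydrophobicity plots and predicted transmembrane segments mentioned above can be illustrated with a sliding-window hydropathy calculation. This is a minimal sketch using the Kyte-Doolittle scale; the text does not state which method or parameters were used, so the 19-residue window and the 1.6 threshold are assumptions (commonly quoted heuristics), and the function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy(protein, window=19):
    """Mean hydropathy of each window along the protein (empty if too short)."""
    scores = [KD[aa] for aa in protein.upper()]
    if len(scores) < window:
        return []
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


def candidate_tm_windows(protein, window=19, threshold=1.6):
    """Start indices (0-based) of windows hydrophobic enough to be membrane-spanning candidates."""
    return [i for i, h in enumerate(hydropathy(protein, window)) if h >= threshold]
```

Two proteins with similar profiles and a candidate segment at a corresponding position would be consistent with the comparison drawn between Rfp and WbdH, although a plot alone does not establish function.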

In summary, three putative transferase genes and an O 
antigen polymerase gene were identified at map position 1 
to 3,020 and 9,982 to 14,516 of the E. coli O111 O antigen 
gene cluster. A search of GenBank has shown that there 
are no genes with significant similarity at the 
nucleotide sequence level for two of the three putative 
transferase genes or the polymerase gene. Figure 5 
provides the nucleotide sequence of the O111 antigen gene 
cluster. 



Materials and Methods-part 3 

A. PCR amplification of O157 antigen gene cluster from 
an E. coli O157:H7 strain (Strain C664-1992, from Statens 
Serum Institut, 5 Artillerivej, 2300, Copenhagen S, 
Denmark) 

The E. coli O157 O antigen gene cluster was amplified by 
using long PCR [Cheng et al. 1994, "Effective 
amplification of long targets from cloned inserts and 
human and genomic DNA" P.N.A.S. USA 91: 5695-569] with 
one primer (primer #412: att ggt agc tgt aag cca agg gcg 
gta gcg t) based on the JumpStart sequence usually found 
in the promoter region of O antigen gene clusters [Hobbs, 
et al. 1994 "The JumpStart sequence: a 39 bp element 
common to several polysaccharide gene clusters" Mol. 
Microbiol. 12: 855-856], and another primer #482 (cac tgc 
cat acc gac gac gcc gat ctg ttg ctt gg) based on the gnd 
gene usually found downstream of the O antigen gene 
cluster. Long PCR was carried out using the Expand Long 
Template PCR System from Boehringer Mannheim (Castle Hill 
NSW Australia), and products, 14 kb in length, from 
several reactions were combined and purified using the 
Promega Wizard PCR Preps DNA Purification System (Madison 
WI USA). The PCR product was then extracted with phenol 
and twice with ether, precipitated with 70% ethanol, and 
resuspended in 40µL of water. 
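
Basic properties of the two long-PCR primers can be tabulated with a short sketch. The sequences are those quoted above, reading the scanned triplet "age" in primer #412 as "agc" (an assumption, since "e" is not a nucleotide code); the helper names are hypothetical.

```python
# Primer sequences as quoted in the text (5' to 3'), spaces kept for readability.
PRIMERS = {
    "412": "att ggt agc tgt aag cca agg gcg gta gcg t",   # JumpStart-based primer
    "482": "cac tgc cat acc gac gac gcc gat ctg ttg ctt gg",  # gnd-based primer
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clean(seq):
    """Remove whitespace and upper-case a primer sequence."""
    return "".join(seq.split()).upper()


def gc_count(seq):
    """Number of G or C bases in the sequence."""
    seq = clean(seq)
    return seq.count("G") + seq.count("C")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return clean(seq).translate(_COMPLEMENT)[::-1]


def summary(name):
    """Return (length, GC count, GC fraction) for a named primer."""
    seq = clean(PRIMERS[name])
    return len(seq), gc_count(seq), gc_count(seq) / len(seq)
```

Calling `summary("412")` and `summary("482")` gives the length and GC content of each primer, which is the kind of check usually made before choosing an annealing temperature.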

B. Construction of a random DNase I bank: 

Two aliquots containing about 150ng of DNA each were 
subjected to DNase I digestion using the Novagen DNase I 
Shotgun Cleavage (Madison WI USA) with a modified 
protocol as described. Each aliquot was diluted into 
45µl of 0.05M Tris-HCl (pH7.5), 0.05mg/mL BSA and 10mM 
MnCl2. 5µL of 1:3000 or 1:4500 dilution of DNase I 
(Novagen) (Madison WI USA) in the same buffer was added 
into each tube respectively and 10µl of stop buffer 
(100mM EDTA, 30% glycerol, 0.5% Orange G, 0.075% xylene 
cyanol) (Novagen) (Madison WI USA) was added after 
incubation at 15°C for 5 min. The DNA from the two DNase I 
reaction tubes was then combined and fractionated on a 
0.8% LMT agarose gel, and the gel segment with DNA of 
about 1kb in size (about 1.5mL agarose) was excised. DNA 
was extracted from agarose using Promega Wizard PCR Preps 
DNA Purification (Madison WI USA) and resuspended in 200 
µL water, before being extracted with phenol and twice 
with ether, and precipitated. The DNA was then 
resuspended in 17.25 µL water and subjected to T4 DNA 
polymerase repair and single dA tailing using the Novagen 
Single dA Tailing Kit (Madison WI USA). The reaction 
product (85µl containing about 8ng DNA) was then 
extracted with chloroform:isoamyl alcohol (24:1) once and 
ligated to 3x 10"^ pmol pGEM-T (Promega) (Madison WI USA) 
in a total volume of 100µL. Ligation was carried out 
overnight at 4°C and the ligated DNA was precipitated and 
resuspended in 20µL water before being electroporated 
into E. coli strain JM109 and plated out on BCIG-IPTG 
plates to give a bank. 

C. Sequencing 

DNA templates from clones of the bank were prepared 
for sequencing using the 96-well format plasmid DNA 
miniprep kit from Advanced Genetic Technologies Corp 
(Gaithersburg MD USA). The inserts of these clones were 
sequenced from one or both ends using the standard M13 
sequencing primer sites located in the pGEM-T vector. 
Sequencing was carried out on an ABI377 automated 
sequencer (CA USA) as described above, after carrying out 
the sequencing reaction on an ABI Catalyst (CA USA). 
Sequence gaps and areas of inadequate coverage were PCR 
amplified directly from O157 chromosomal DNA using 
primers based on the already obtained sequencing data and 
sequenced using the standard M13 sequencing primer sites 
attached to the PCR primers. 

D. Analysis of the E. coli O157 O antigen gene cluster 
nucleotide sequence data 

Sequence data were processed and analysed using the 
Staden programs [Staden, R., 1982 "Automation of the 
computer handling of gel reading data produced by the 
shotgun method of DNA sequencing." Nuc. Acid Res. 10: 
4731-4751; Staden, R., 1986 "The current status and 
portability of our sequence handling software". Nuc. Acid 
Res. 14: 217-231; Staden, R. 1982 "An interactive 
graphics program for comparing and aligning nucleic acid 
and amino acid sequence". Nuc. Acid Res. 10: 2951-2961]. 

Figure 4 shows the structure of the E. coli O157 O antigen 
gene cluster. Twelve open reading frames were predicted 
from the sequence data, and the nucleotide and amino acid 
sequences of all these genes were then used to search the 
GenBank database for indication of possible function and 
specificity of these genes. The position of each gene is 
listed in Table 9. The nucleotide sequence is presented 
in Figure 6. 

orfs 10 and 11 showed high level identity to manC 
and manB and were named manC and manB respectively. orf7 
showed 89% identity (at amino acid level) to the gmd gene 
of the E. coli colanic acid capsule gene cluster 
(Stevenson, G. et al. 1996 "Organisation of the 
Escherichia coli K-12 gene cluster responsible for 
production of the extracellular polysaccharide colanic 
acid". J. Bacteriol. 178:4885-4893) and was named gmd. 
orf8 showed 79% and 69% identity (at amino acid level) 
respectively to wcaG of the E. coli colanic acid capsule 
gene cluster and to the wbcJ (orf14.8) gene of the Yersinia 
enterocolitica O8 O antigen gene cluster (Zhang, L. et 
al. 1997 "Molecular and chemical characterization of the 
lipopolysaccharide O-antigen and its role in the 
virulence of Yersinia enterocolitica serotype O8". Mol. 
Microbiol. 23:63-76). Colanic acid and the Yersinia O8 O 
antigen both contain fucose, as does the O157 O antigen. 
There are two enzymatic steps required for GDP-L-fucose 
synthesis from GDP-4-keto-6-deoxy-D-mannose, the product 
of the gmd gene. However, it has been shown 
recently (Tonetti, M. et al. 1996 "Synthesis of 
GDP-L-fucose by the human FX protein" J. Biol. Chem. 
271:27274-27279) that the human FX protein has "significant 
homology" with the wcaG gene (referred to as yefB in that 
paper), and that the FX protein carries out both 
reactions to convert GDP-4-keto-6-deoxy-D-mannose to 
GDP-L-fucose. We believe that this makes a very strong case 
for orf8 carrying out these two steps and propose to name 
the gene fcl. In support of the one enzyme carrying out 
both functions is the observation that there are no genes 
other than manB, manC, gmd and fcl with similar levels of 
similarity between the three bacterial gene clusters for 
fucose containing structures. 

orf5 is very similar to wbeE (rfbE) of Vibrio 
cholerae O1, which is thought to be the perosamine 
synthetase, which converts GDP-4-keto-6-deoxy-D-mannose 
to GDP-perosamine (Stroeher, U.H. et al. 1995 "A putative 
pathway for perosamine biosynthesis is the first function 
encoded within the rfb region of Vibrio cholerae O1". 
Gene 166: 33-42). V. cholerae O1 and E. coli O157 O 
antigens contain perosamine and N-acetyl-perosamine 
respectively. The V. cholerae O1 manA, manB, gmd and 
wbeE genes are the only genes of the V. cholerae O1 gene 
cluster with significant similarity to genes of the E. 
coli O157 gene cluster and we believe that our 
observations both confirm the prediction made for the 
function of wbeE of V. cholerae, and show that orf5 of the 
O157 gene cluster encodes GDP-perosamine synthetase. 
orf5 is therefore named per. orf5 plus about 100bp of 
the upstream region (position 4022-5308) was previously 
sequenced by Bilge, S.S. et al. [1996 "Role of the 
Escherichia coli O157:H7 O side chain in adherence and 
analysis of an rfb locus". Infect. Immun. 64:4795-4801]. 

orf12 shows high level similarity to the conserved 
region of about 50 amino acids of various members of an 
acetyltransferase family (Lin, W., et al. 1994 "Sequence 
analysis and molecular characterisation of genes required 
for the biosynthesis of type 1 capsular polysaccharide in 
Staphylococcus aureus". J. Bacteriol. 176: 7005-7016) and 
we believe it is the N-acetyltransferase that converts 
GDP-perosamine to GDP-perNAc. orf12 has been named wbdR. 

The genes manB, manC, gmd, fcl, per and wbdR account 
for all of the expected biosynthetic pathway genes of the 
O157 gene cluster. 

The remaining biosynthetic step(s) required are for 
synthesis of UDP-GalNAc from UDP-Glc. It has been 
proposed (Zhang, L., et al. 1997 "Molecular and chemical 
characterisation of the lipopolysaccharide O-antigen and 
its role in the virulence of Yersinia enterocolitica 
serotype O8". Mol. Microbiol. 23:63-76) that in Yersinia 
enterocolitica UDP-GalNAc is synthesised from UDP-GlcNAc 
by a homologue of galactose epimerase (GalE), for which 
there is a galE like gene in the Yersinia enterocolitica 
O8 gene cluster. In the case of O157 there is no galE 
homologue in the gene cluster and it is not clear how 
UDP-GalNAc is synthesised. It is possible that the 
galactose epimerase encoded by the galE gene in the gal 
operon can carry out conversion of UDP-GlcNAc to 
UDP-GalNAc in addition to conversion of UDP-Glc to UDP-Gal. 
There do not appear to be any gene(s) responsible for 
UDP-GalNAc synthesis in the O157 gene cluster. 

orf4 shows similarity to many wzx genes and is named 
wzx. orf2 shows similarity of secondary 
structure in the predicted protein to other wzy genes and 
is for that reason named wzy. 

The orf1, orf3 and orf6 gene products all have 
characteristics of transferases, and have been named 
wbdN, wbdO and wbdP respectively. The O157 O antigen has 
4 sugars and 4 transferases are expected. The first 
transferase to act would put a sugar phosphate onto 
undecaprenol phosphate. The two transferases known to 
perform this function, WbaP (RfbP) and WecA (Rfe), 
transfer galactose phosphate and N-acetyl-glucosamine 
phosphate respectively to undecaprenol phosphate. 
Neither of these sugars is present in the O157 structure. 

Further, none of the presumptive transferases in the 
O157 gene cluster has the transmembrane segments found in 
WecA and WbaP, which transfer a sugar phosphate to 
undecaprenol phosphate, and which are expected for any 
protein that transfers a sugar to undecaprenol phosphate, 
which is embedded within the membrane. 

The wecA gene, whose product transfers GlcNAc-P to 
undecaprenol phosphate, is located in the Enterobacterial 
Common Antigen (ECA) gene cluster and it functions in ECA 
synthesis in most and perhaps all E. coli strains, and 
also in O antigen synthesis for those strains which have 
GlcNAc as the first sugar in the O unit. 

It appears that WecA acts as the transferase for 
addition of GalNAc-1-P to undecaprenol phosphate for the 
Yersinia enterocolitica O8 O antigen [Zhang et al. 1997 
"Molecular and chemical characterisation of the 
lipopolysaccharide O antigen and its role in the 
virulence of Yersinia enterocolitica serotype O8" Mol. 
Microbiol. 23: 63-76.] and perhaps does so here as the 
O157 structure includes GalNAc. WecA has also been 
reported to add glucose-1-phosphate to undecaprenol 
phosphate in E. coli O8 and O9 strains, and an 
alternative possibility for transfer of the first sugar 
to undecaprenol phosphate is WecA mediated transfer of 
glucose, as there is a glucose residue in the O157 O 
antigen. In either case the requisite number of 
transferase genes are present if GalNAc or Glc is 
transferred by WecA and the side chain Glc is transferred 
by a transferase outside of the O antigen gene cluster. 

orf9 shows high level similarity (44% identity at 
amino acid level, same length) with the wcaH gene of the E. 
coli colanic acid capsule gene cluster. The function of 
this gene is unknown, and we give orf9 the name wbdQ. 

The DNA between manB and wbdR has strong sequence 
similarity to one of the H-repeat units of E. coli K-12. 
Both of the inverted repeat sequences flanking this 
region are still recognisable, each with two of the 11 
bases being changed. The H-repeat associated protein 
encoding gene located within this region has a 267 base 
deletion and mutations in various positions. It seems 
that the H-repeat unit has been associated with this gene 
cluster for a long period of time since it translocated 
to the gene cluster, perhaps playing a role in assembly 
of the gene cluster as has been proposed in other cases. 

Materials and Methods - part 4 

To test our hypothesis that O antigen genes for 
transferases and the wzx and wzy genes were more specific 
than pathway genes for diagnostic PCR, we first carried 
out PCR using primers for all the E. coli O16 O antigen 
genes (Table 7). PCR was then carried out using 
primers for the E. coli O111 transferase, wzx and wzy genes 
(Table 8, 8A). PCR was also carried out using 
primers for the E. coli O157 transferase, wzx and wzy 
genes (Table 9, 9A). 

Chromosomal DNA from the 166 serotypes of E. coli 
available from Statens Serum Institut, 5 Artillerivej, 
2300 Copenhagen, Denmark was isolated using the Promega 
Genomic (Madison WI USA) isolation kit. Note that 164 of 
the serogroups are described by Ewing W.H.: Edwards and 
Ewing's "Identification of the Enterobacteriaceae" 
Elsevier, Amsterdam 1986 and that they are numbered 1-171 
with numbers 31, 47, 67, 72, 93, 94 and 122 no longer 
valid. Of the two serogroup 19 strains we used 19ab 
strain F8188-41. Lior H. 1994 ["Classification of 
Escherichia coli" In Escherichia coli in domestic animals 
and humans pp 31-72. Edited by C.L. Gyles, CAB 
International] adds two more, numbered 172 and 173, to give 
the 166 serogroups used. Pools containing 5 to 8 samples 
of DNA per pool were made. Pool numbers 1 to 19 (Table 
4) were used in the E. coli O111 and O157 assay. Pool 
numbers 20 to 28 were also used in the O111 assay, and 
pool numbers 22 to 24 contained E. coli O111 DNA and were 
used as positive controls (Table 5). Pool numbers 29 to 
42 were also used in the O157 assay, and pool numbers 31 
to 36 contained E. coli O157 DNA, and were used as 
positive controls (Table 6). Pool numbers 2 to 20, 30, 
43 and 44 were used in the E. coli O16 assay (Tables 4 to 
6). Pool number 44 contained DNA of E. coli K-12 strains 
C600 and WG1 and was used as a positive control as 
between them they have all of the E. coli K-12 O16 O 
antigen genes. 
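
The serogroup count given above can be verified with a short calculation. The numbering and the list of invalid numbers come from the text; the `make_pools` helper and its fixed pool size of 8 are hypothetical and only illustrate that 166 samples can be split into pools of 5 to 8. The actual pool composition is that of Tables 4 to 6 and is not reproduced here.

```python
# Serogroup numbers 1-171 from Ewing, minus the numbers no longer valid.
INVALID = {31, 47, 67, 72, 93, 94, 122}
ewing_serogroups = [n for n in range(1, 172) if n not in INVALID]

# Lior adds serogroups 172 and 173.
all_serogroups = ewing_serogroups + [172, 173]


def make_pools(samples, size=8):
    """Split samples into consecutive pools of at most `size` members (illustrative only)."""
    return [samples[i:i + size] for i in range(0, len(samples), size)]


pools = make_pools(all_serogroups)
print(len(ewing_serogroups), len(all_serogroups), len(pools))
```

This reproduces the 164 and 166 figures quoted in the text.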

PCR reactions were carried out under the following 
conditions: denaturing, 94°C/30"; annealing, temperature 
varies (refer to Tables)/30"; extension, 72°C/1'; 30 
cycles. Each PCR reaction was carried out in a volume of 
25 µL for each pool. After the PCR reaction, 10 µL of PCR 
product from each pool was run on an agarose gel to check 
for amplified DNA. 
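
As a rough illustration, the cycling conditions above can be written down as a small data structure. This is a sketch only: the class and field names are invented here, the annealing temperature of 55 is an example value (the text refers to the Tables for the actual values), and ramp times between steps are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CyclingProgram:
    """PCR cycling conditions; temperatures in degrees C, times in seconds."""
    anneal_temp_c: float           # varies per primer pair (see Tables)
    denature_temp_c: float = 94.0
    extend_temp_c: float = 72.0
    denature_s: int = 30
    anneal_s: int = 30
    extend_s: int = 60
    cycles: int = 30

    def hold_time_s(self) -> int:
        """Total programmed hold time, ignoring ramping between steps."""
        per_cycle = self.denature_s + self.anneal_s + self.extend_s
        return self.cycles * per_cycle


if __name__ == "__main__":
    program = CyclingProgram(anneal_temp_c=55.0)  # example annealing value
    print(program.hold_time_s() / 60, "minutes of programmed holds")
```

With 30 cycles of 30 s, 30 s and 60 s, the programmed holds add up to one hour.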

Each E. coli chromosomal DNA sample was checked by 
gel electrophoresis for the presence of chromosomal DNA 
and by PCR amplification of the E. coli mdh gene using 
oligonucleotides based on E. coli K-12 [Boyd et al. 
(1994) "Molecular genetic basis of allelic polymorphism 
in malate dehydrogenase (mdh) in natural populations of 
Escherichia coli and Salmonella enterica" Proc. Nat. 
Acad. Sci. USA. 91:1280-1284.] Chromosomal DNA samples 
from other bacteria were only checked by gel 
electrophoresis of chromosomal DNA. 

A. Primers based on the E. coli O16 O antigen gene cluster 
sequence. 

The O antigen gene cluster of E. coli O16 was the 
only typical E. coli O antigen gene cluster that had been 
fully sequenced prior to that of O111, and we chose it 
for testing our hypothesis. One pair of primers for each 
gene was tested against pools 2 to 20, 30 and 43 of E. 
coli chromosomal DNA. The primers, annealing 
temperatures and functional information for each gene are 
listed in Table 8. 

For the five pathway genes, there were 17/21, 13/21, 
0/21, 0/21 and 0/21 positive pools for rmlB, rmlD, rmlA, 
rmlC and glf respectively (Table 7). For the wzx, wzy 
and three transferase genes there were no positives 
amongst the 21 pools of E. coli chromosomal DNA tested 
(Table 7). In each case pool #44 gave a positive 
result. 
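
The pool counts quoted above can be tallied as follows. This is a minimal sketch: the dictionary holds only the genes named in the text (the three transferase genes, which also gave 0/21, are not named in this section and are therefore omitted), and the function names are illustrative.

```python
POOLS_TESTED = 21

# Number of positive pools per O16 gene, as quoted in the text.
POSITIVE_POOLS = {
    "rmlB": 17,
    "rmlD": 13,
    "rmlA": 0,
    "rmlC": 0,
    "glf": 0,
    "wzx": 0,
    "wzy": 0,
}


def cross_reaction_rate(gene):
    """Fraction of the tested pools that gave a positive result."""
    return POSITIVE_POOLS[gene] / POOLS_TESTED


def genes_without_cross_reaction():
    """Genes whose primers were positive in none of the tested pools."""
    return sorted(g for g, n in POSITIVE_POOLS.items() if n == 0)


if __name__ == "__main__":
    for gene in POSITIVE_POOLS:
        print(f"{gene}: {cross_reaction_rate(gene):.0%} of pools positive")
    print("no cross-reaction:", genes_without_cross_reaction())
```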

B. Primers based on the E. coli O111 O antigen gene 
cluster sequence. 

One to four pairs of primers for each of the 
transferase, wzx and wzy genes of O111 were tested 
against pools 1 to 21 of E. coli chromosomal DNA 
(Table 8). For wbdH, four pairs of primers, which bind 
to various regions of this gene, were tested and found to 
be specific for O111, as there was no amplified DNA of the 
correct size in any of those 21 pools of E. coli 
chromosomal DNA tested. Three pairs of primers for wbdM 
were tested, and all are specific, although primers 
#985/#986 produced a band of the wrong size from one 
pool. Three pairs of primers for wzx were tested and 
all were specific. Two pairs of primers were tested 
for wzy; both are specific, although #980/#983 gave a band 
of the wrong size in all pools. One pair of primers for 
wbdL was tested and found to be non-specific, and therefore no 
further tests were carried out. Thus wzx, wzy and two of 
the three transferase genes are highly specific to O111. 

Bands of the wrong size found in amplified DNA are 
assumed to be due to chance hybridisation to genes widely 
present in E. coli. The primers, annealing temperatures 
and positions for each gene are in Table 8. 

The O111 assay was also performed using pools 
including DNA from O antigen expressing Yersinia 
pseudotuberculosis, Shigella boydii and Salmonella 
enterica strains (Table 8A). None of the 
oligonucleotides derived from wbdH, wzx, wzy or wbdM gave 
amplified DNA of the correct size with these pools. 
Notably, pool number 25 includes S. enterica Adelaide, 
which has the same O antigen as E. coli O111: this pool 
did not give a positive PCR result for any primers tested, 
indicating that these genes are highly specific for E. 
coli O111. 

Each of the 12 pairs binding to wbdH, wzx, wzy and 
wbdM produces a band of the predicted size with the pools 
containing O111 DNA (pools number 22 to 24). As pools 22 
to 24 included DNA from all strains present in pool 21 
plus O111 strain DNA (Table 5), we conclude that the 12 
pairs of primers all give a positive PCR test with each 
of three unrelated O111 strains but not with any other 
strains tested. Thus these genes are highly specific for 
E. coli O111. 
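
The decision rule used throughout this section (a primer pair counts as specific when a band of the predicted size appears in the pools containing the target DNA and in no other pool, with wrong-size bands disregarded) might be sketched as below. The function names, the 10% size tolerance and the band sizes in the example are assumptions made for illustration; none of them come from the original.

```python
def has_expected_band(band_sizes_bp, expected_bp, tolerance=0.10):
    """True if any observed band lies within tolerance of the expected size."""
    return any(
        abs(size - expected_bp) <= tolerance * expected_bp
        for size in band_sizes_bp
    )


def classify_primer_pair(expected_bp, bands_by_pool, positive_pools):
    """Classify one primer pair from gel results.

    bands_by_pool maps pool number -> list of observed band sizes (bp).
    positive_pools is the set of pools known to contain target DNA.
    """
    missed = [
        pool for pool in positive_pools
        if not has_expected_band(bands_by_pool.get(pool, []), expected_bp)
    ]
    cross_reacting = [
        pool for pool, bands in bands_by_pool.items()
        if pool not in positive_pools and has_expected_band(bands, expected_bp)
    ]
    if missed:
        return "no product in a positive control"
    if cross_reacting:
        return "non-specific"
    return "specific"


if __name__ == "__main__":
    # Invented gel readings: pool 21 shows only a wrong-size band.
    gel = {20: [], 21: [1200], 22: [500], 23: [510], 24: [495]}
    print(classify_primer_pair(500, gel, positive_pools={22, 23, 24}))
```

A wrong-size band, as reported for #985/#986 and #980/#983, does not change the classification under this rule.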

C. Primers based on the E. coli O157 O antigen gene 
cluster sequence. 

Two or three primer pairs for each of the 
transferase, wzx and wzy genes of O157 were tested 
against E. coli chromosomal DNA of pools 1 to 19, 29 and 
30 (Table 9). For wbdN, three pairs of primers, which 
bind to various regions of this gene, were tested and 
found to be specific for O157, as there was no amplified 
DNA in any of those 21 pools of E. coli chromosomal DNA 
tested. Three pairs of primers for wbdO were tested, and 
all are specific, although primers #1211/#1212 
produced two or three bands of the wrong size from all 
pools. Three pairs of primers were tested for wbdP and 
all were specific. Two pairs of primers were tested 
for wbdR and both were specific. For wzy, three 
pairs of primers were tested and all were specific, 
although primer pair #1203/#1204 produced one or three 
bands of the wrong size in each pool. For wzx, two pairs 
of primers were tested and both were specific, although 
primer pair #1217/#1218 produced 2 bands of the wrong size in 
2 pools, and 1 band of the wrong size in 7 pools. Bands of 
the wrong size found in amplified DNA are assumed to be 
due to chance hybridisation to genes widely present in E. 
coli. The primers, annealing temperatures and functional 
information for each gene are in Table 9. 

The O157 assay was also performed using pools 37 to 
42, including DNA from O antigen expressing Yersinia 
pseudotuberculosis, Shigella boydii, Yersinia 
enterocolitica O9, Brucella abortus and Salmonella 
enterica strains (Table 9A). None of the 
oligonucleotides derived from wbdN, wzy, wbdO, wzx, wbdP 
or wbdR reacted specifically with these pools, except 
that primer pair #1203/#1204 produced two bands with Y. 
enterocolitica O9, one of which is the same 
size as that from the positive control. Primer pair 
#1203/#1204 binds to wzy. The predicted secondary 






structures of Wzy proteins are generally similar, 
although there is very low similarity at the amino acid or 
DNA level among the sequenced wzy genes. Thus, it is 
possible that Y. enterocolitica O9 has a wzy gene closely 
related to that of E. coli O157. It is also possible 
that this band is due to chance hybridization to another 
gene, as the other two wzy primer pairs (#1205/#1206 and 
#1207/#1208) did not produce any band with Y. 
enterocolitica O9. Notably, pool number 37 includes S. 
enterica Landau, which has the same O antigen as E. coli 
O157, and pools 38 and 39 contain DNA of B. abortus and Y. 
enterocolitica O9, which cross-react serologically with E. 
coli O157. This result indicates that these genes are 
highly O157 specific, although one primer pair may have 
cross-reacted with Y. enterocolitica O9. 

Each of the 16 pairs binding to wbdN, wzx, wzy, 
wbdO, wbdP and wbdR produces a band of the predicted size 
with the pools containing O157 DNA (pools number 31 to 
36). As pool 29 included DNA from all strains present in 
pools 31 to 36 other than O157 strain DNA (Table 6), we 
conclude that the 16 pairs of primers all give a positive 
PCR test with each of the five unrelated O157 strains. 

Thus PCR using primers based on genes wbdN, wzy, 
wbdO, wzx, wbdP and wbdR is highly specific for E. coli 
O157, giving positive results with each of six unrelated 
O157 strains, while only one primer pair gave a band of 
the expected size with one of three strains with O 
antigens known to cross-react serologically with E. coli 
O157. 

TABLE 1 

H7 strains used in this work in addition to the H 
antigen type strains 

Name used in    Serotype     Original name    Source 
this study 

M527            O157:H7      C664-1992        a 
M917            O18ac:H7     A57              IMVS 
M918            O18ac:H7     A62              IMVS 
M973            O2:H7        A1107            CDC 
M1004           O157:H7      EH7              b 
M1179           O18ac:H7     D-M3291/54       IMVS 
M1200           O7:H7        A64              c 
M1211           O19ab:H7     F8188-41         IMVS 
M1328           O53:H7       14097            IMVS 
M1686           O55:H7       TB156            d 

a.    Statens Serum Institut, Copenhagen, Denmark. 
b.    Dr R. Brown of Royal Children's Hospital, Melbourne, 
      Australia. 
c.    Max-Planck Institut für molekulare Genetik, Berlin, 
      Germany. 
d.    Dr P. Tarr of Children's Hospital and Medical Center, 
      University of Washington, USA. 
IMVS, Institute of Medical and Veterinary Science, Adelaide, 
      Australia. 
CDC,  Centers for Disease Control and Prevention, Atlanta, 
      USA. 

Table 2 

Oligonucleotides used to PCR amplify fliC genes 
from different H type strains for sequencing 

H Type     Annealing            Primers Used 
Strains    Temperature (°C) 

1          55                   #1575/#1576 
2          55                   #1285/#1286 
3          55                   #1285/#1286 
4          50                   #1431/#1432 
5          60                   [illegible] 
6          55                   [illegible] 
7          [illegible]          [illegible] 
8          [illegible]          [illegible] 
9          [illegible]          [illegible] 
10         [illegible]          [illegible] 
11         55                   [illegible] 
12         [illegible]          [illegible] 
14         [illegible]          [illegible] 
15         [illegible]          [illegible] 
16         [illegible]          [illegible] 
17         [illegible]          [illegible] 
18         [illegible]          [illegible] 
19         [illegible]          [illegible] 
20         [illegible]          [illegible] 
21         [illegible]          [illegible] 
23         [illegible]          [illegible] 
24         [illegible]          [illegible] 
25         [illegible]          [illegible] 
26         [illegible]          [illegible] 
27         [illegible]          [illegible] 
28         [illegible]          [illegible] 
29         [illegible]          [illegible] 
30         [illegible]          [illegible] 
31         [illegible]          [illegible] 
32         [illegible]          [illegible] 
33         [illegible]          [illegible] 
34         [illegible]          [illegible] 
35         [illegible]          [illegible] 
37         [illegible]          [illegible] 
38         [illegible]          [illegible] 
39         [illegible]          [illegible] 
40         [illegible]          [illegible] 
41         [illegible]          [illegible] 
42         60                   #1285/#1286 
43         60                   #1575/#1576 
44         60                   #1285/#1286 
45         60                   #1575/#1576 
46         60                   #1575/#1576 
47         55                   #1285/#1286 
48         60                   #1575/#1576 
49         60                   #1575/#1576 
50         60                   #1285/#1286 
51         60                   #1575/#1576 
52         60                   #1575/#1576 
54         50                   #1431/#1432 
55         60                   #1285/#1286 
56         60                   #1285/#1286 

Table 3 

Summary of the flagellin sequences obtained and specific H type 
oligonucleotide primers 

Columns: 
(1) H type strain(s) the sequenced gene(s) obtained from 
(2) H specificity coded by the gene(s) 
(3) H type strain from which the flagellin gene sequence was used for 
    primer choice 
(4) Positions of primer 1 
(5) Positions of primer 2 

(1)          (2)    (3)                        (4)                        (5) 

1            1      1                          [illegible]                1172-1189 
2            2      2                          [illegible]                1039-1056 
4, 17, 44    4      4                          [illegible]                628-648 
5            5      5                          697-714                    877-897 
6            6      6                          565-585                    799-816 
7            7      7                          553-570 (primer #1806)     1483-1500 (primer #1809) 
9            9      9                          616-633                    838-855 
10           10     10                         559-579                    697-717 
11           11     11                         586-606*                   791-810* 
12           12     12                         892-909                    1172-1189 
14                  14                         586-606                    793-813 
15           15     15                         640-660                    817-834 
3            16     3                          649-666                    925-942 
18           18     18                         589-606                    802-819 
19           19     19                         607-624                    838-855 
20                  20                         574-591                    760-780 
21, 47              21                         676-693**                  862-879** 
23                  23                         637-654                    1336-1353 
24                  24                         496-516                    772-792 
26           26     26                         553-570                    772-789 
27           27                                685-702                    799-819 
28           28                                592-609                    778-798 
29           29     29                         538-555                    757-774 
30           30     30                         814-831                    943-962 
31           31     31                         571-588                    790-807 
32           32                                814-831                    1057-1074 
33           33     33                         [illegible]                718-735 
34           34     34                         [illegible]                796-816 
38, 55       38     38                         [illegible]                709-729 
39           39     39                         [illegible]                718-735 
41           41     41                         [illegible]                784-801 
42           42     42                         [illegible]                715-735 
43           43     43                         580-597                    844-861 
45           45     45                         640-657                    943-963 
46           46     46                         565-582                    781-801 
49           49     49                         589-609                    754-771 
51           51     51                         565-582                    1042-1059 
52           52     52                         598-615                    829-846 
56           56     56                         697-714                    877-897 
8 and 40            8                          562-579                    1045-1062 
25                  25                         529-549                    703-723 
35                  non-functional H11 gene    769-789*                   1045-1065* 
37                  37                         520-537                    715-735 
48                  48                         568-585                    835-852 
54                  non-functional H21 gene    988-1008**                 1344-1364** 

*  See section 13 for choice of primers for the flagellin gene of H11 
** See section 13 for choice of primers for the flagellin gene of H21 
See text 
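
The primer positions in Table 3 imply an expected PCR product size for each H type. The sketch below assumes that both position ranges are numbered along the same flagellin gene sequence, that primer 1 is the upstream primer and that primer 2 is the downstream (reverse) primer, so the product runs from the first base of primer 1 to the last base of primer 2. These are assumptions about the table's conventions, not statements from the original, and the function name is illustrative.

```python
def amplicon_length(primer1, primer2):
    """Expected product length in bp from two (start, end) position pairs.

    Positions are taken as 1-based and inclusive, with primer1 upstream
    of primer2 on the same sequence.
    """
    start = primer1[0]
    end = primer2[1]
    if end < start:
        raise ValueError("primer 2 must lie downstream of primer 1")
    return end - start + 1


if __name__ == "__main__":
    # Position pairs copied from legible rows of Table 3.
    rows = {
        "H5": ((697, 714), (877, 897)),
        "H7": ((553, 570), (1483, 1500)),
        "H43": ((580, 597), (844, 861)),
    }
    for h_type, (p1, p2) in rows.items():
        print(h_type, amplicon_length(p1, p2), "bp")
```

Under these assumptions the H7 pair (#1806/#1809) would give a product of 948 bp.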

Table 3A 

Cloning, expression and identification of flagellin genes 


Table 3b Oligonucleotide primers used for PCR amplification and 
cloning of H antigen genes 

#1868 5'- cat gcc atg gca caa gtc att aat acc -3' 
NcoI 

#1869 5'- ata tgt cga ctt aac cct gca gca gag aca g -3' 
SalI 

#1870 5'- atg gat cct taa ccc tgc agc aga gac ag -3' 
BamHI 

#1871 5'- aac tgc agt taa ccc tgt agc aga gac ag -3' 
PstI 

#1872 5'- cgg gat ccc gca gac tgg ttc ttg ttg at -3' 
BamHI 

#1878 5'- cgg gat cca ctt cta tcg agc gcc tct ct -3' 

#1884 5'- gct cta gag cgc aga tca ttc agc agg cc -3' 
XbaI 

#1885 5'- gct cta gac atg ttg gac act tcg gtc gc -3' 
XbaI 
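
Each cloning primer in Table 3b is listed with a restriction enzyme, and the corresponding recognition site should be present in the primer. The sketch below checks this. The recognition sequences are the standard ones for these enzymes; the primer sequences are written with spaces removed and with scan-ambiguous letters resolved, which is an assumption of this example, and the dictionary and function names are illustrative.

```python
# Standard recognition sequences (5'->3', lower case to match the table).
RECOGNITION_SITES = {
    "NcoI": "ccatgg",
    "SalI": "gtcgac",
    "BamHI": "ggatcc",
    "PstI": "ctgcag",
    "XbaI": "tctaga",
}

# Primer number -> (sequence, enzyme named under the primer in Table 3b).
PRIMERS = {
    "#1868": ("catgccatggcacaagtcattaatacc", "NcoI"),
    "#1869": ("atatgtcgacttaaccctgcagcagagacag", "SalI"),
    "#1870": ("atggatccttaaccctgcagcagagacag", "BamHI"),
    "#1871": ("aactgcagttaaccctgtagcagagacag", "PstI"),
    "#1872": ("cgggatcccgcagactggttcttgttgat", "BamHI"),
    "#1884": ("gctctagagcgcagatcattcagcaggcc", "XbaI"),
    "#1885": ("gctctagacatgttggacacttcggtcgc", "XbaI"),
}


def site_position(sequence, enzyme):
    """Return the 0-based index of the enzyme's site, or -1 if absent."""
    return sequence.find(RECOGNITION_SITES[enzyme])


if __name__ == "__main__":
    for name, (seq, enzyme) in PRIMERS.items():
        print(name, enzyme, "site at index", site_position(seq, enzyme))
```

Primer #1878 is left out because no enzyme is listed for it in the table.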

TABLE 4 

Pool   Strains of which chromosomal DNA included in the pool                              Source* 
No. 

1      E. coli type strains for O serotypes 1, 2, 3, 4, 10, 16, 18 and 39                 IMVS a 
2      E. coli type strains for O serotypes 40, 41, 48, 49, 71, 73, 88 and 100            IMVS 
3      E. coli type strains for O serotypes 102, 109, 119, 120, 121, 125, 126 and 137     IMVS 
4      E. coli type strains for O serotypes 138, 139, 149, 7, 5, 6, 11 and 12             IMVS 
5      E. coli type strains for O serotypes 13, 14, 15, 17, 19ab, 20, 21 and 22           IMVS 
6      E. coli type strains for O serotypes 23, 24, 25, 26, 27, 28, 29 and 30             IMVS 
7      E. coli type strains for O serotypes 32, 33, 34, 35, 36, 37, 38 and 42             IMVS 
8      E. coli type strains for O serotypes 43, 44, 45, 46, 50, 51, 52 and 53             IMVS 
9      E. coli type strains for O serotypes 54, 55, 56, 57, 58, 59, 60 and 61             IMVS 
10     E. coli type strains for O serotypes 62, 63, 64, 65, 66, 68, 69 and 70             IMVS 
11     E. coli type strains for O serotypes 74, 75, 76, 77, 78, 79, 80 and 81             IMVS 
12     E. coli type strains for O serotypes 82, 83, 84, 85, 86, 87, 89 and 90             IMVS 
13     E. coli type strains for O serotypes 91, 92, 95, 96, 97, 98, 99 and 101            IMVS 
14     E. coli type strains for O serotypes 103, 104, 105, 106, 107, 108 and 110          IMVS 
15     E. coli type strains for O serotypes 112, 162, 113, 114, 115, 116, 117 and 118     IMVS 
16     E. coli type strains for O serotypes 123, 165, 166, 167, 168, 169, 170 and 171     See b 
17     E. coli type strains for O serotypes 172, 173, 127, 128, 129, 130, 131 and 132     See c 
18     E. coli type strains for O serotypes 133, 134, 135, 136, 140, 141, 142 and 143     IMVS 
19     E. coli type strains for O serotypes 144, 145, 146, 147, 148, 150, 151 and 152     IMVS 

a. Institute of Medical and Veterinary Science, Adelaide, Australia 

b. 123 from IMVS; the rest from Statens Serum Institut, Copenhagen, Denmark 

c. 172 and 173 from Statens Serum Institut, Copenhagen, Denmark; the rest from 
IMVS 

TABLE 5 

Pool   Strains of which chromosomal DNA included in the pool                              Source* 
No. 

20     E. coli type strains for O serotypes 153, 154, 155, 156, 157, 158, 159 and 160     IMVS 
21     E. coli type strains for O serotypes 161, 163, 164, 8, 9 and 124                   IMVS 
22     As pool #21, plus E. coli O111 type strain Stoke W.                                IMVS 
23     As pool #21, plus E. coli O111:H2 strain C1250-1991                                See d 
24     As pool #21, plus E. coli O111:H12 strain C156-1989                                See e 
25     As pool #21, plus S. enterica serovar Adelaide                                     See f 
26     Y. pseudotuberculosis strains of O groups IA, IIA, IIB, IIC, III, IVA, IVB,        See g 
       VA, VB, VI and VII 
27     S. boydii strains of serogroups 1, 3, 4, 5, 6, 8, 9, 10, 11, 12, 14 and 15         See h 
28     S. enterica strains of serovars (each representing a different O group)            IMVS 
       Typhi, Montevideo, Ferruch, Jangwani, Raus, Hvittingfoss, Waycross, 
       Dan, Dugbe, Basel, 65:i:e,n,z15 and 52:d:e,n,x,z15 

d. C1250-1991 from Statens Serum Institut, Copenhagen, Denmark 
e. C156-1989 from Statens Serum Institut, Copenhagen, Denmark 
f. S. enterica serovar Adelaide from IMVS 
g. Dr S Aleksic of Institute of Hygiene, Germany 
h. Dr J Lefebvre of Bacterial Identification Section, Laboratoire de Santé Publique 
   du Québec, Canada 

TABLE 6 

Pool   Strains of which chromosomal DNA included in the pool                              Source* 
No. 

29     E. coli type strains for O serotypes 153, 154, 155, 156, 158, 159 and 160          IMVS 
30     E. coli type strains for O serotypes 161, 163, 164, 8, 9, 111 and 124              IMVS 
31     As pool #29, plus E. coli O157 type strain A2 (O157:H19)                           IMVS 
32     As pool #29, plus E. coli O157:H16 strain C475-89                                  See d 
33     As pool #29, plus E. coli O157:H45 strain C727-89                                  See d 
34     As pool #29, plus E. coli O157:H2 strain C252-94                                   See d 
35     As pool #29, plus E. coli O157:H39 strain C258-94                                  See d 
36     As pool #29, plus E. coli O157:H26                                                 See e 
37     As pool #29, plus S. enterica serovar Landau                                       See f 
38     As pool #29, plus Brucella abortus                                                 See g 
39     As pool #29, plus Y. enterocolitica O9                                             See h 
40     Y. pseudotuberculosis strains of O groups IA, IIA, IIB, IIC, III, IVA, IVB, VA,    See i 
       VB, VI and VII 
41     S. boydii strains of serogroups 1, 3, 4, 5, 6, 8, 9, 10, 11, 12, 14 and 15         See j 
42     S. enterica strains of serovars (each representing a different O group) Typhi,     IMVS 
       Montevideo, Ferruch, Jangwani, Raus, Hvittingfoss, Waycross, Dan, Dugbe, 
       Basel, 65:i:e,n,z15 and 52:d:e,n,x,z15 
43     E. coli type strains for O serotypes 1, 2, 3, 4, 10, 18 and 29                     IMVS 
44     As pool #43, plus E. coli K-12 strains C600 and WG1                                IMVS, See k 

d. O157 strains from Statens Serum Institut, Copenhagen, Denmark 
e. O157:H26 from Dr R Brown of Royal Children's Hospital, Melbourne, Victoria 
f. S. enterica serovar Landau from Dr M Poppoff of Institut Pasteur, Paris, France 
g. B. abortus from the culture collection of The University of Sydney, Sydney, Australia 
h. Y. enterocolitica O9 from Dr. K. Bettelheim of Victorian Infectious Diseases Reference 
   Laboratory, Victoria, Australia. 
i. Dr S Aleksic of Institute of Hygiene, Germany 
j. Dr J Lefebvre of Bacterial Identification Section, Laboratoire de Santé Publique du 
   Québec, Canada 
k. Strains C600 and WG1 from Dr. B.J. Backmann of Department of Biology, Yale 
   University, USA. 